Parallel Computations

Parallel Computation - running the "pfea" program interactively

Introduction

The modules for the finite element solution of the dam seepage problem on an adaptive mesh are written in FORTRAN 77 and use the MPI communication library. The 'pfea' program is run on the Data Star by following the instructions given below.
If you are interested in more information regarding the use of the Data Star, it's recommended that you read the Data Star User Guide.

How to start

These programs can be run from a Linux environment. To start, you have two options:
- Use one of the Linux machines. In this case, simply open a terminal.
- Use one of the Windows NT machines. In this case, you need the "SSH" program. To start SSH, go to
Start-->Programs-->Network Applications-->SSH-->Secure Shell Client
Type "linux" as the Host Name and click Connect. Be patient if it does not connect immediately. Enter your CADLAB login and password. This connects you to one of the Linux machines in Engineering I, and you are now working in your account through a Unix interface.

Login on the Data Star

To log in to the Data Star, use the secure shell command from the Linux environment:
        % ssh -lUserID dslogin.sdsc.edu
The first time you log in, you must change your password:
        % rlogin tfpasswd
        % passwd

Transfer the files from the HPSS [High Performance Storage System] to the Data Star

To transfer the "fea" modules and auxiliary files to the Data Star, use "pftp", an ftp-like interface to HPSS. The session below shows how to use the "pftp" utility:
        % pftp 
    pftp> cd /users/csb/u4078 
    pftp> get fea.tar 
    pftp> quit 
        %   

Compiling FORTRAN programs on the Data Star

In order to compile and run the pfea module, you first have to untar the file fea.tar:
        % tar xvf fea.tar
In the directory fea, the following files will be present:
        - Makefilepfea
        - Makefilegen
        - fea.f
        - node.f
        - minband.f
        - genmesh.f
        - decom.f
        - plotfea.f
To compile the fea module, change to the fea directory (type "ls" to see what is inside the directory). The make utility automatically determines which pieces of a large program need to be recompiled and issues the commands to compile them. It expects to find a file named Makefile in the current directory, but this directory contains two makefiles, one for each program, so make has to be run once for each. First copy one of them to Makefile and run make, then copy the other to Makefile (overwriting the first) and run make again. (You can learn more about any command by typing "man command_name".)
        % cd fea
        % ls
        % cp Makefilepfea Makefile
        % make
        % cp Makefilegen Makefile  
        overwrite Makefile? y
        % make 
These commands create the executable "pfea" from fea.f and node.f, and the executable "gen" from minband.f, genmesh.f and decom.f.

Running programs on the Data Star

To run the 'pfea' module you must be logged in to one of the so-called interactive nodes:
        % ssh -lUserID  dspoe.sdsc.edu
These are the nodes that have been set up for interactive use. There are 4 interactive nodes, each with 8 POWER4 CPUs and 2 GB of memory. Interactive access to the nodes is shared: at times more than one user job may be running on a node, so run times can vary significantly.

To run the adaptive mesh finite element program, you first generate the initial mesh and then run the finite element program, which produces an error estimate. Based on the error estimate, a new refined mesh is created for another finite element analysis. This process is repeated until the error of the finite element analysis falls below the desired value. The program "gen" creates the initial mesh and performs the mesh refinement. It also performs the domain decomposition and generates the data for the parallel finite element analysis. The program "pfea" performs the finite element analysis in parallel.
To generate the initial mesh, run "gen" first. The program asks for some inputs:
        % gen
        1) create mesh
        2) refine mesh
        1
        what is the desired area?
        your_desired_area
        how many subdomains?
        your_number_of_subdomains
        write ploting file (1=yes,0=no)?
        your_choice        
Once the initial mesh has been generated, the input files for the parallel finite element program are ready. (Note the new files "da1", "da2", etc., "datfield" and "neib" and, if you chose to write a plotting file, "mesh1", "mesh2", etc.)
We can now run the 'pfea' module using these input files. The command for this is "poe".
The "poe" command invokes the Parallel Operating Environment for loading and executing programs on remote processor nodes. The operation of POE is influenced by a number of POE environment variables. The environment variables that we are going to use are explained below; the explanations were obtained by typing "man poe". Each command-line flag (starting with -) corresponds to the environment variable named MP_ followed by the flag name in upper case, and sets that variable temporarily for the duration of the execution:
     
     MP_NODES
     Specifies the number of physical nodes on which to run the parallel
     tasks. It may be used alone or in conjunction with MP_TASKS_PER_NODE
     and/or MP_PROCS. The value of this environment variable can be 
     overridden using the -nodes flag.

     MP_TASKS_PER_NODE
     Specifies the number of tasks to be run on each of the physical nodes.
     It may be used in conjunction with MP_NODES and/or MP_PROCS,
     but may not be used alone. The value of this environment variable 
     can be overridden using the -tasks_per_node flag.    

     MP_RMPOOL
     Determines the name or number of the pool that should be used for
     non-specific node allocation. This environment variable/command-line
     flag only applies to LoadLeveler. Valid values are any identifying
     pool name or number. There is no default. The value of this
     environment variable can be overridden using the -rmpool flag.

     MP_EUILIB
     Determines the communication subsystem library implementation to use
     for communication - either the IP communication subsystem or the US
     communication subsystem. In order to use the US communication
     subsystem, you must have an SP system configured with its high
     performance switch feature. Valid, case-sensitive, values are ip (for
     the IP communication subsystem) or us (for the US communication
     subsystem). The value of this environment variable can be overridden
     using the -euilib flag.
     
     MP_EUIDEVICE
     Determines the adapter set to use for message passing. Valid values
     are en0 (for Ethernet), fi0 (for FDDI), tr0 (for token-ring), css0
     (for the SP system's high performance switch feature), and csss (for
     the SP switch 2 high performance adapter).
In our case, we use the following syntax:
        % poe pfea -nodes number_of_nodes -tasks_per_node number_of_tasks -rmpool 1 -euilib ip -euidevice en0
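Since each flag has a matching environment variable, the same settings could also be made once through the environment before calling poe. The following is a minimal sketch for a Bourne-style shell (sh), not a tested Data Star recipe. The function name set_poe_env is hypothetical, and in csh or tcsh each assignment would instead be written as "setenv NAME value".

```shell
#!/bin/sh
# Hypothetical helper: export the POE settings used in this tutorial.
# $1 = number of nodes, $2 = tasks per node.
set_poe_env() {
    MP_NODES=$1
    MP_TASKS_PER_NODE=$2
    MP_RMPOOL=1          # pool number used in this tutorial
    MP_EUILIB=ip         # IP communication subsystem
    MP_EUIDEVICE=en0     # Ethernet adapter
    export MP_NODES MP_TASKS_PER_NODE MP_RMPOOL MP_EUILIB MP_EUIDEVICE
}

# Example use (left as a comment so the sketch has no side effects):
#   set_poe_env 1 4 && poe pfea
```

Any flag given on the poe command line would still override the corresponding variable.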
The number of processors used is the product of number_of_nodes and number_of_tasks, and it should be equal to the number of subdomains specified during mesh generation. If there are more processors than subdomains, you get an error message. If there are fewer processors than subdomains, you do not get an error message, but some of the subdomains are left uncalculated.
At most 32 processors can be used. The following combinations are recommended:
4 processors -> -nodes 1 -tasks_per_node 4
8 processors -> -nodes 1 -tasks_per_node 8
16 processors -> -nodes 2 -tasks_per_node 8
32 processors -> -nodes 4 -tasks_per_node 8
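
The rule behind these combinations can be written down as a small helper. This is only a sketch: the function name poe_layout is hypothetical, the limits of 8 CPUs per node and 32 processors are taken from the text above, and the handling of counts that are not in the list (for example 6 subdomains on one node) is an assumption rather than a documented recommendation.

```shell
#!/bin/sh
# Hypothetical helper: print the -nodes / -tasks_per_node flags for a given
# number of subdomains, so that nodes * tasks_per_node = subdomains.
poe_layout() {
    nsub=$1
    cpus_per_node=8    # each interactive node has 8 CPUs
    max_procs=32       # at most 32 processors can be used

    # Reject anything that is not a plain positive integer.
    case $nsub in
        ''|*[!0-9]*) echo "error: '$nsub' is not a number" >&2; return 1 ;;
    esac
    if [ "$nsub" -lt 1 ] || [ "$nsub" -gt "$max_procs" ]; then
        echo "error: need between 1 and $max_procs subdomains" >&2
        return 1
    fi

    if [ "$nsub" -le "$cpus_per_node" ]; then
        # Everything fits on a single node.
        echo "-nodes 1 -tasks_per_node $nsub"
    elif [ $((nsub % cpus_per_node)) -eq 0 ]; then
        # Fill whole nodes with 8 tasks each.
        echo "-nodes $((nsub / cpus_per_node)) -tasks_per_node $cpus_per_node"
    else
        echo "error: $nsub is not a multiple of $cpus_per_node" >&2
        return 1
    fi
}
```

For 16 subdomains the helper prints "-nodes 2 -tasks_per_node 8", which matches the list above.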

This program also asks for some inputs: the relaxation factor for the interface relaxation method, and "etamax", the desired percentage error (so 0.05 should be entered for a desired error of 5%).
After the execution, the "pfea" program will have created some data files for its own use (re1, re2, etc.) and the file out.dat, in which the parameters and results of the run are written. This is an ASCII text file that can be viewed with the "more" or "cat" commands or with any editor, and it contains:
       % more out.dat

       ==========================================
       no. of proc.: 
       relaxation factor: 
       number of nodes per processor: 
       total ite # : 
       max sor time: 
       total error^2 is :  
       total q^2 is :  
       calculated percentage error is :  
       desired percentage error is :  
       effective index theta is : 
If the calculated percentage error is larger than the desired percentage error that you entered, the mesh should be refined to give a smaller error. To do this, run "gen" again and this time choose "2) refine mesh". You can change the number of subdomains (provided that you then run pfea with the matching number of processors). You are not asked for the desired area again, since it is needed only when the mesh is first generated.
Once the mesh is refined, a new finite element analysis can be performed by running "pfea" again as described above. It is advisable to keep the relaxation factor and etamax the same. By comparing the calculated error with the desired error, you can keep refining the mesh and rerunning the analysis until the desired percentage error is reached.
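The comparison of the two error values can also be done by a script instead of by eye. The sketch below assumes that each value in out.dat follows the colon on the same line as its label, that both values are written in the same units, and that they are in a number format awk understands (a FORTRAN "D" exponent, for instance, would not be parsed). The function name needs_refinement is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: decide from out.dat whether another refinement is needed.
# Exit status: 0 = calculated error exceeds desired error (refine again),
#              1 = desired error reached,
#              2 = one of the two values was not found.
needs_refinement() {
    awk -F: '
        /calculated percentage error/ { calc = $2 + 0; have_calc = 1 }
        /desired percentage error/    { want = $2 + 0; have_want = 1 }
        END {
            if (!have_calc || !have_want) exit 2
            if (calc > want) exit 0
            exit 1
        }' "$1"
}

# Example use (left as a comment):
#   needs_refinement out.dat && echo "run gen (option 2) and pfea again"
```

If out.dat holds the results of several runs, the sketch uses the last pair of values it finds.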

Plotting the graphs

The 'pfea' program outputs a set of files named "da1", "da2", etc. Each file holds the output data from one processor.
The 'gen' program also outputs a set of files named "mesh1", "mesh2", etc. Each file holds the mesh data for one processor. The plotfea.f program reads these data files and writes new files in a plottable format.
First, compile this program by typing
   
        % f77 -o plot plotfea.f
Successful compilation will generate the executable file "plot".

Running the "plot" program

The "plot" program accepts both value data files (e.g. da1, da2, etc.) and mesh data files (e.g. mesh1, mesh2, etc.). It can write its output in one of two plot formats: Gnuplot or Matlab.
When the program is run, it asks for some input:
        % plot
        please choose format of output:gnuplot(g)/matlab(m)
and then
  
        What is the file to be plotted?
        Mesh map(m)/Value plot(v) 
Mesh map files are mesh1, mesh2, etc. and value plot files are da1, da2, etc. After these inputs, the program asks
        how many files do you want to convert?
Since the number of files is equal to the number of subdomains and processors, running the program separately for each file would be tedious, so the program can convert multiple files in a single run. If the number of files to be converted is not 1, it asks for another input:
        do you want to combine these files in one file?y/n 
Since each file contains only one subdomain, the files must be combined in order to see the whole domain. If you choose to combine, the program outputs a single file that is the combination of the subdomains you entered.
Note that the "mesh1", "mesh2", etc. files are already in Gnuplot format and can be combined with the "cat" command. To obtain a combination of all the mesh subdomains (4 in this case):
 
        % cat mesh1 mesh2 mesh3 mesh4 > your_file_name
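With many subdomains, typing every file name becomes tedious, and a wildcard such as mesh* would sort mesh10 before mesh2. A small loop avoids both problems. This is a sketch only: the function name combine_files and its arguments are hypothetical, and it assumes the files are numbered consecutively from 1.

```shell
#!/bin/sh
# Hypothetical helper: concatenate prefix1 .. prefixN into one file,
# in numeric order.
# $1 = file name prefix (e.g. mesh), $2 = number of files, $3 = output file.
combine_files() {
    prefix=$1
    count=$2
    out=$3
    : > "$out" || return 1          # create or empty the output file
    i=1
    while [ "$i" -le "$count" ]; do
        if [ ! -f "$prefix$i" ]; then
            echo "error: $prefix$i not found" >&2
            return 1
        fi
        cat "$prefix$i" >> "$out"
        i=$((i + 1))
    done
}

# Example use (left as a comment):
#   combine_files mesh 16 your_file_name
```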

Plotting the graphs in Gnuplot

Gnuplot is interactive plotting software available on UNIX. If you choose to use Gnuplot, follow the steps below to obtain the graphs.
First run the "plot" program to obtain the plottable output files in Gnuplot format. Then transfer the files to a UCSB machine (into your X:\ directory) using the scp utility. The examples below use a file named "pi.c"; substitute the name of your own file.
To copy the file "pi.c" from a local machine to the Data Star:

          scp pi.c u4078@dslogin.sdsc.edu:pi.c

To copy the file "pi.c" from the Data Star to the Engineering domain machines:

          scp pi.c stefan@linux.engr.ucsb.edu:pi.c

This copies the "pi.c" file to the machines in the "ENGR" domain.
["u4078" is the LoginID on the Data Star and "stefan" is the LoginID on the
Engineering domain machines]
After completing this, you can exit the Data Star by typing "exit" or "logout". Now that you are back in your engineering account, start Gnuplot by typing
        % gnuplot 
The following Gnuplot commands produce the output in Postscript format for printing:
        gnuplot> set term postscript
        gnuplot> set output 'your_output_name.ps' 
Note that you should set a new output name for each file you want to plot, otherwise the same file will keep being overwritten. Here are a few commands to make your plots look neater (use these before the plot command):
 
        gnuplot> set title 'title_of_your_plot'
        gnuplot> set xlabel 'x-axis'
        gnuplot> set ylabel 'y-axis'
For the output of value data files:
        gnuplot> splot 'your_file_name' w l
For the output of mesh data files:
        gnuplot> plot 'your_file_name' w l 
To quit Gnuplot:
        gnuplot> quit
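When several files have to be plotted, the commands above can be collected in a script file instead of being typed each time. The sketch below only writes such a script to standard output. The function name write_gnuplot_script and its arguments are hypothetical, and using the data file name as the plot title is just a placeholder choice.

```shell
#!/bin/sh
# Hypothetical helper: print a Gnuplot script for one data file.
# $1 = data file, $2 = kind (mesh or value), $3 = Postscript output file.
write_gnuplot_script() {
    case $2 in
        mesh)  cmd=plot ;;     # mesh data files are plotted with plot
        value) cmd=splot ;;    # value data files are plotted with splot
        *) echo "error: kind must be mesh or value" >&2; return 1 ;;
    esac
    cat <<EOF
set term postscript
set output '$3'
set title '$1'
$cmd '$1' w l
EOF
}

# Example use (left as a comment):
#   write_gnuplot_script mesh_all mesh mesh_all.ps > mesh_all.gp
#   gnuplot mesh_all.gp
```

Running gnuplot with a script file name as its argument executes the commands in that file and then exits, so no interactive session is needed.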
Now you can access your "*.ps" files through a File Manager, and use the available software for Postscript format (e.g. GhostView) to print your plots.

Plotting the graphs in Matlab

Matlab is powerful mathematical software that also has a nice plotting interface. To plot your graphs in Matlab, use the Matlab option of the "plot" program and give your output files names ending in ".m".
(A tip about Matlab file names: a file whose name starts with "mesh", e.g. "mesh1.m", or with a number, e.g. "1.m", will be interpreted as a command (an invalid one, in fact) and will produce an error message just because of its name.)
Once you obtain your output files (in Matlab format), transfer them to a UCSB machine (into your X:\ directory) with scp, in the same way as described for the Gnuplot files above, and exit the Data Star by typing "exit" or "logout".
On Windows, you can start Matlab in the conventional way.
On Linux, you can use Matlab through the X Window System by typing
        % matlab -display your_computer_name:0.0    (your_computer_name e.g. ecipc004.engr.ucsb.edu)
or by setting the display to your computer first and then running Matlab:
        % setenv DISPLAY your_computer_name:0.0    (your_computer_name e.g. ecipc004.engr.ucsb.edu)
        % matlab
Once Matlab is running, open your files and run them either by pressing F5 or from the menu:
Debug --> (Save and) Run
Print your plots from the Matlab figure window.