To show that dynamic LibPFASST works in more complex settings, dynamic resources are applied to a 2D heat equation solver that is parallelized both in time and in space.
The original solver was part of the LibPFASST repository (Examples/hypre). This modified version is available separately on GitHub.
The following video demonstrates how the heat equation is being solved and how process set operations are being used to dynamically grow and shrink the solver in the time dimension:
The heat equation is solved on a 2D square domain [0,π] x [0,π] with a zero boundary condition. The initial condition is given by u(x,y,0) = sin(x) + sin(y), which creates a "bump" on the square domain. A closed-form solution is given by u(x,y,t) = exp(-2πt) (sin(x) + sin(y)). This makes it possible to compute the error in the solution produced by LibPFASST.
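As a small illustration (not the repository's actual error routine; the function name and kind parameter are placeholders), the closed-form solution stated above can be evaluated pointwise like this to compare against the numerical result:

pure function exact_solution(x, y, t) result(u)
  ! Evaluate the closed-form solution u(x,y,t) = exp(-2*pi*t) * (sin(x) + sin(y))
  ! so the LibPFASST result can be compared against it pointwise.
  use, intrinsic :: iso_fortran_env, only: dp => real64
  real(dp), intent(in) :: x, y, t
  real(dp)             :: u
  real(dp), parameter  :: pi = acos(-1.0_dp)
  u = exp(-2.0_dp * pi * t) * (sin(x) + sin(y))
end function exact_solution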
In addition to parallelizing in time using LibPFASST, the domain is split up in a grid-wise manner in space. For example, the spatial domain can be split into 4, 9, 16, ... smaller squares. This is implemented by running 4, 9, 16, ... parallel PFASST instances, where the sweeper uses communication across the space domain to compute the time derivatives.
In the non-dynamic case, this communication can be hidden transparently from LibPFASST.
The main routine creates a grid out of MPI_COMM_WORLD using MPI_Comm_split so that each process is part of exactly one time and one space communicator. The time communicator is then used as the main LibPFASST communicator, and the space communicator is used for solving the space component of the equation using the HYPRE library.
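A minimal sketch of such a split (assuming a row-major layout where consecutive ranks share a time step; this is illustrative and not the showcase's actual main routine):

program split_grid
  use mpi
  implicit none
  integer, parameter :: nspace = 4   ! processes per time step (illustrative)
  integer :: ierr, world_rank, space_comm, time_comm

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, world_rank, ierr)

  ! Processes belonging to the same time step share a space communicator,
  ! processes with the same spatial block share a time communicator.
  call mpi_comm_split(MPI_COMM_WORLD, world_rank / nspace,     world_rank, space_comm, ierr)
  call mpi_comm_split(MPI_COMM_WORLD, mod(world_rank, nspace), world_rank, time_comm,  ierr)

  ! ... time_comm is handed to LibPFASST, space_comm to the HYPRE-based sweeper ...

  call mpi_finalize(ierr)
end program split_grid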
Because resize operations work on process sets, we cannot rely on MPI_Comm_split but instead need to split up the process sets directly. This is done in the following fashion (implemented in create_pset_grid in src/comm.f90):
Here we need the order preservation assumption to create a consistent grid (the indexing in the third step must work).
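Purely for illustration (this is not the code in create_pset_grid), the following program prints which indices of the ordered global process set would form each space and each time process set under a row-major layout; the order preservation assumption is what guarantees that these indices refer to the same processes on every rank:

program pset_grid_indices
  implicit none
  integer, parameter :: nspace = 4, ntime = 4   ! illustrative grid dimensions
  integer :: t, s

  ! Space process set for time rank t: a consecutive block of nspace indices.
  do t = 0, ntime - 1
    write(*, '(a,i0,a,100i4)') 'space pset ', t, ':', (t * nspace + s, s = 0, nspace - 1)
  end do

  ! Time process set for space rank s: every nspace-th index, starting at s.
  do s = 0, nspace - 1
    write(*, '(a,i0,a,100i4)') 'time pset ', s, ':', (t * nspace + s, t = 0, ntime - 1)
  end do
end program pset_grid_indices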
Because process set operations need global coordination, the fact that multiple PFASST instances are running in parallel cannot be transparently hidden from dynamic LibPFASST.
Instead, LibPFASST requires additional information to be able to resize correctly. This way of running LibPFASST is called "space-parallel mode".
It can be enabled by passing additional global_pset and horizontal_pset arguments to the pf_dynprocs_create routine. This will set the respective attributes of pf%dynprocs and create the respective communicators.
Furthermore, the resize operations become more complex. The logic is implemented in the following subroutines: pf_dynprocs_create_comm (for joining an existing run), pf_dynprocs_handle_growth_global (for growing by a number of time steps and establishing communication with the new processes), and pf_dynprocs_handle_shrink_global (for shrinking by a number of time steps).
Again, we need the order preservation assumption for this to work and to keep the grid consistent.
These implement resizing in the following fashion (see also the video at the top):
Grow:
1. Issue a PSETOP_GROW request with ntime * num_new_timesteps as the requested size.
2. Obtain the resulting process sets via MPI_Sessions_dyn_v2a_query_psetop.
Shrink:
1. Perform a PSETOP_UNION operation across the last num_timesteps_to_remove space process sets.
2. Perform nspace PSETOP_DIFF operations, one for each time process set, to get the new time process sets.
3. Issue a PSETOP_SHRINK request with ntime * num_timesteps_to_remove as the requested size. Pass the delta process set as an argument, so the runtime only removes these processes.
4. Obtain the resulting process sets via MPI_Sessions_dyn_v2a_query_psetop.
Note: If you have followed the instructions for the Docker build in the Open MPI section, the showcase was already cloned and compiled at /opt/hpc/build/showcase_dyn_libpfasst.
To compile this program, make sure that you have built the dynamic version of LibPFASST that can be found here.
Then run the following commands:
# clone the repository
git clone https://github.com/boi4/showcase_dyn_libpfasst.git && cd showcase_dyn_libpfasst
# clone hypre
git clone https://github.com/hypre-space/hypre.git
# build hypre
cd hypre/src && ./configure --disable-fortran && make -j && cd ../..
# finally, compile this project
make LIBPFASST=/path/to/LibPFASST/
Please refer to the LibPFASST documentation to see what parameters you can set in probin.nml.
The following additional parameters can control the run:
dump_values <- logical, whether to dump solution values after each block
dump_dir <- string, where to dump the values to
nspace <- integer, number of processes per time step, must be a square number
T0 <- float, start time t0 of the simulation
TFin <- float, final time tfin of the simulation
nsteps <- integer, number of timesteps
Note that when nsteps/(TFin-T0) drops below some threshold (it appears to be around 1), the algorithm becomes numerically unstable and a division by zero may appear.
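For illustration, a probin.nml fragment setting these parameters could look like the following (the namelist group name PARAMS is an assumption; check the probin.nml shipped with the showcase for the actual grouping):

&PARAMS
  dump_values = .true.     ! dump solution values after each block
  dump_dir    = 'dump'     ! directory for the dumped values
  nspace      = 4          ! 4 processes per time step (2x2 spatial grid)
  T0          = 0.0        ! start time
  TFin        = 1.0        ! final time
  nsteps      = 16         ! nsteps/(TFin-T0) = 16, well above the stability threshold
/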
You can run the solver with the following command:
mpirun <YOUR MPI RUN ARGUMENTS HERE> ./main.exe probin.nml
Because of the resize granularity restriction of the runtime, make sure that the number of processes per node equals the number of processes per time step (nspace).
If you followed the Docker setup as described in the Open MPI section, you can pass the following arguments to mpirun (for nspace=4):
mpirun --mca btl_tcp_if_include eth0 --host n01:4,n02:4,n03:4,n04:4,n05:4,n06:4,n07:4,n08:4 -np 16 ./main.exe probin.nml