Tools

Debugging parallel programs is hard, especially MPI programs.

Debugging dynamic MPI programs is even harder. That is why I created multiple tools to facilitate the development and debugging of dynamic applications.

tmpi.py

tmpi.py allows you to interact with each process of an MPI run in a different tmux (a terminal multiplexer) pane. This is very useful in terminal-only environments (ssh/docker/...) where typical tricks like starting multiple instances of a terminal emulator do not work.

The source code is available on GitHub: https://github.com/boi4/tmpi-py

It is a Python rewrite of tmpi with the following benefits:

It has the following disadvantages:

Dynamic Applications

When running applications using dynamic Open MPI, tmpi.py will respond to resource changes in the following fashion:

Dependencies (installed on each MPI host)

Further requirements

Installation

Just copy the tmpi.py script somewhere in your PATH. If you run MPI on multiple hosts, the tmpi.py script must be available at the same location on each host.

Full usage

./tmpi.py [number of initial processes] COMMAND ARG1 ...

You need to pass at least two arguments. The first argument is the number of processes to use; every argument after that is part of the command line to run.

If the environment variable TMPI_REMAIN is set to true, the new window remains open on exit and has to be closed manually ("C-b + &" by default).

You can pass additional mpirun arguments via the MPIRUNARGS environment variable.

You can use the environment variable TMPI_TMUX_OPTIONS to pass options to the tmux invocation, such as TMPI_TMUX_OPTIONS='-f ~/.tmux.conf.tmpi' to use a special tmux configuration for tmpi.

Usage hint: By default, the panes in the window are synchronized. If you wish to work with only one process without distraction, maximize the corresponding pane ("C-b + z" by default). Return to the global view using the same shortcut.
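As a rough illustration of how the argument order and the environment variables above fit together, the following Python sketch assembles the environment and argument list for a tmpi.py call. The helper name build_tmpi_invocation and its parameters are made up for this example; only TMPI_REMAIN, MPIRUNARGS, TMPI_TMUX_OPTIONS and the argument order come from the description above.

```python
import os


def build_tmpi_invocation(nprocs, command, remain=False,
                          mpirun_args=None, tmux_options=None):
    """Return (env, argv) for a tmpi.py call (hypothetical helper)."""
    if nprocs < 1 or not command:
        raise ValueError("need a process count and a command to run")

    env = dict(os.environ)  # start from a copy of the current environment
    if remain:
        env["TMPI_REMAIN"] = "true"  # keep the tmux window open on exit
    if mpirun_args:
        env["MPIRUNARGS"] = mpirun_args  # extra arguments for mpirun
    if tmux_options:
        env["TMPI_TMUX_OPTIONS"] = tmux_options  # options for tmux itself

    # First argument: initial number of processes, then the command line.
    argv = ["tmpi.py", str(nprocs), *command]
    return env, argv


env, argv = build_tmpi_invocation(
    4, ["gdb", "-q", "-x", "script.gdb", "executable"], remain=True)
print(argv)
```

The resulting pair can then be handed to subprocess.run(argv, env=env), assuming tmpi.py is on the PATH.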

Examples

Parallel debugging with GDB:

tmpi.py 4 gdb executable

It is advisable to run gdb with a script (e.g. script.gdb) so you can use

tmpi.py 4 gdb -x script.gdb executable

If you have a lot of processes, you will want to set pagination off (e.g. in script.gdb) and add the -q argument to gdb:

tmpi.py 4 gdb -q -x script.gdb executable

This avoids pagination and the output of gdb's copyright notice, which can be a nuisance when you have very small tmux panes.

A more complicated tmpi.py command might look like this:

MPIRUNARGS='--mca btl_tcp_if_include eth0 --host n01:4,n02:4,n03:4,n04:4,n05:4,n06:4,n07:4,n08:4' \
    TMPI_REMAIN="true" \
        tmpi.py 32 \
        gdb -q \
                -ex "set pagination off" \
                -ex "set breakpoint pending on" \
                -ex "b _gfortran_runtime_error_at" \
                -ex "b ompi_errhandler_invoke" \
                -ex "b myfile.f90:1337" \
                -ex r \
                -ex q \
                --args \
                ./executable arg1 arg2 ...

Here, tmpi.py will run 32 MPI processes on 8 hosts in parallel with GDB attached to each of them. Also, GDB will break on Fortran and Open MPI errors and on a custom user breakpoint. Note that the -ex commands could also be put into a script file.
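Long host lists and chains of -ex options are easy to mistype. The Python sketch below shows one way to generate both pieces of the command above and to check that the process count matches the number of slots. The helper names host_list and gdb_command are hypothetical and not part of tmpi.py.

```python
def host_list(hosts, slots_per_host):
    """Build the value for mpirun's --host option and the total slot count."""
    spec = ",".join(f"{host}:{slots_per_host}" for host in hosts)
    return spec, len(hosts) * slots_per_host


def gdb_command(program_args, gdb_commands):
    """Build a gdb argument list that runs each command via -ex."""
    argv = ["gdb", "-q"]
    for command in gdb_commands:
        argv += ["-ex", command]
    return argv + ["--args", *program_args]


hosts = [f"n{i:02d}" for i in range(1, 9)]  # n01 ... n08
spec, total = host_list(hosts, 4)
mpirun_args = f"--mca btl_tcp_if_include eth0 --host {spec}"

argv = ["tmpi.py", str(total)] + gdb_command(
    ["./executable", "arg1", "arg2"],
    ["set pagination off", "set breakpoint pending on",
     "b ompi_errhandler_invoke", "r", "q"])

print(total)        # 32 (8 hosts with 4 slots each)
print(mpirun_args)
```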

Keybindings

In general, the keybindings from tmux apply. The most useful ones are the following:

Screenshots


tmpi.py showing 16 gdb processes debugging an MPI application.
tmpi.py running an application that grows and shrinks dynamically on up to 4 hosts.


DynVis


It can be useful to visualize a dynamic MPI run in order to figure out retrospectively which process set events happened. It is usually quite difficult to figure this out on your own just by looking at print statements in the terminal (although tmpi.py already drastically improves this situation).

Therefore, a log file format was designed and an accompanying log file visualizer called DynVis was implemented. The source code of DynVis is available on GitHub.

Logging format (v1)

A v1 log file of a dynamic MPI run is a CSV file containing lines in the following format:

unixtimestampmilis (integer), #job id (integer), action (string), action data (string, single line JSON, escaped with double quotes)

Here, unixtimestampmilis is the number of milliseconds since the Unix epoch, job_id is a unique identifier for the MPI job that was started, and action is one of the actions listed below. The final JSON column provides additional information about the action.

Additionally, we allow for blank lines, and lines starting with a pound sign (#), which are both ignored while parsing.
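The parsing rules above can be sketched with the Python standard library alone. This is a minimal illustration, not the parser used by DynVis: the function name parse_log is hypothetical, and the sample lines (timestamps included) are invented for the example.

```python
import csv
import json


def parse_log(text):
    """Parse v1 log text into (timestamp_ms, job_id, action, data) tuples."""
    events = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comment lines are ignored
        # The csv module undoes the double-quote escaping of the JSON column.
        row = next(csv.reader([stripped], skipinitialspace=True))
        timestamp, job_id, action, data = row
        events.append((int(timestamp), int(job_id), action, json.loads(data)))
    return events


# Invented sample data in the v1 format.
sample = '''# initial start
1700000000000,0,job_start,"{""job_id"": 0}"

1700000000250,0,process_start,"{""proc_id"": 0}"
'''

for event in parse_log(sample):
    print(event)
```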

Available actions:

Action | Description | Example JSON
job_start | A new job is started. | "{""job_id"" : 0}"
job_end | A job has finished. | "{""job_id"" : 0}"
new_pset | A new process set is announced. Needs to be done before any other interaction with that pset. | "{""proc_ids"" : [0,1,2,3,4,5,6,7], ""id"": ""mpi://world_0""}"
set_start | A new process set is started (initial start or an add/grow). | "{""set_id"" : ""mpi://world_0""}"
process_start | A new process has started. | "{""proc_id"" : 0}"
process_shutdown | A process has shut down. | "{""proc_id"" : 0}"
psetop | A process set operation has been applied by the runtime. | "{""initialized_by"": 0, ""set_id"": ""mpi://world_0"", ""op"": ""grow"", ""input_sets"": [], ""output_sets"": [""mpi://grow_0""]}"
finalize_psetop | The application has successfully called finalize_psetop. | "{""initialized_by"": 0, ""set_id"": ""mpi://world_0""}"
application_message | Some message from the application. | "{""message"" : ""LibPFASST started""}"
application_custom | Some custom data from the application. | arbitrary, but valid JSON

Rationale: Because it is CSV, this format can be parsed easily by most programs, such as Excel or Python Pandas. CSV also allows log files to be merged by simple concatenation and lets the logger write one action at a time, which is harder with a more complex format like pure JSON. Blank lines make it possible to visually separate phases of the application, and comments make it possible to manually add context to specific events.
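To illustrate the "one action at a time" property, here is a small Python sketch of a logger-side helper that formats a single v1 line. The name format_log_line and its signature are assumptions made for this example; the csv module takes care of doubling the quotes inside the JSON column.

```python
import csv
import io
import json
import time


def format_log_line(job_id, action, data, timestamp_ms=None):
    """Return one v1 log line (including the trailing newline)."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # milliseconds since epoch
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # json.dumps yields single-line JSON; csv quotes and escapes it.
    writer.writerow([timestamp_ms, job_id, action, json.dumps(data)])
    return buffer.getvalue()


line = format_log_line(
    0, "new_pset", {"proc_ids": [0, 1], "id": "mpi://world_0"},
    timestamp_ms=1700000000000)
print(line, end="")
```

Since every call produces a complete line, a logger can append lines to a file as events happen, and two log files can be merged by concatenating them.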

Furthermore, here are some basic rules for the contents of a log file:

The following code snippet can be used to read the log file (log_file_path) in Python Pandas:

import pandas as pd
import json

# define columns
columns = ['unixtimestamp', 'job_id', 'event', 'event_data']

# read csv file, ignore comments
df = pd.read_csv(log_file_path, names=columns, comment='#')

# remove empty and invalid lines
df.dropna(how="all", inplace=True)

# parse json
df['event_data'] = df['event_data'].apply(json.loads)

An example log file can be found here. A useful class for parsing these logfiles can be found here.

Requirements

The visualization is based on Manim, a visualization library for mathematical concepts.

Make sure to follow the installation instructions to install Manim on your operating system.

Usage

Clone the DynVis repo:

git clone https://github.com/boi4/dynprocs_visualize.git && cd dynprocs_visualize

Run the dynvis.py script with the path to the log file:

python3 ./dynvis.py path/to/log/file

This will create a rendered video at media/videos/480p15/VisualizeDynProcs.mp4.

There exist some command line flags to tweak the behavior of DynVis:

usage: dynvis.py [-h] [--quality {low_quality,medium_quality,high_quality}] [--preview] [--round-to ROUND_TO] logfile

positional arguments:
  logfile

options:
  -h, --help            show this help message and exit
  --quality {low_quality,medium_quality,high_quality}, -q {low_quality,medium_quality,high_quality}
  --preview, -p
  --round-to ROUND_TO, -r ROUND_TO
                        On how many 10^r miliseconds to round the time to when aligning events
  --save_last_frame, -s
                        Save last frame as a picture
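
The help text for --round-to is terse. One plausible reading, which is an assumption and not taken from the DynVis source, is that each timestamp is rounded to the nearest multiple of 10^r milliseconds so that events close together in time end up in the same step. The Python sketch below illustrates that reading with invented events; round_timestamp and group_events are hypothetical names.

```python
from collections import defaultdict


def round_timestamp(timestamp_ms, r):
    """Round to the nearest multiple of 10**r ms (assumed semantics)."""
    step = 10 ** r
    return (timestamp_ms + step // 2) // step * step


def group_events(events, r):
    """Group (timestamp_ms, action) pairs by their rounded timestamp."""
    groups = defaultdict(list)
    for timestamp_ms, action in events:
        groups[round_timestamp(timestamp_ms, r)].append(action)
    return dict(groups)


# Invented events; with r=2 the first two fall into the same 100 ms step.
events = [(1012, "process_start"), (1049, "process_start"), (1951, "psetop")]
print(group_events(events, 2))
```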

Example Output


Animated visualization of the example log file.
Final result of the visualization


MPI modifier scripts

I also created some bash scripts that can simply be prepended to the actual command run by MPI. These scripts usually modify the output of each rank and can be helpful for debugging. They also work with dynamic Open MPI.

For example, instead of running:

mpirun -np 8 ./main.exe probin.nml

you run:

mpirun -np 8 ./color_rank.sh ./main.exe probin.nml

to color the output of each process differently.

The scripts can also be chained together:

mpirun -np 8 ./color_rank.sh ./prepend_rank.sh ./main.exe probin.nml

Script | Description
color_rank.sh | Colors the output of each process based on its $PMIX_RANK.
env_wrapper.sh | Prints all environment variables of each process at the beginning.
ltrace_run.sh | Uses ltrace to capture pset operation related MPI calls. Cannot be combined with GDB.
prepend_rank.sh | Prepends $PMIX_RANK to each line of the processes.
prepend_spacing.sh | Adds some amount of spacing based on $PMIX_RANK to each line of the processes. When making the terminal font very small, this can visualize the outputs of different ranks next to each other.

Note: You might need to chmod +x the scripts after downloading them to make them executable.
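
To show the general idea behind such a wrapper, here is a minimal Python sketch in the spirit of prepend_rank.sh. It is not the bash script itself: the function name run_with_rank_prefix and the "[rank] " prefix format are choices made for this example. It runs a command, reads its standard output line by line, and prefixes each line with the rank taken from $PMIX_RANK.

```python
import io
import os
import subprocess
import sys


def run_with_rank_prefix(argv, rank, out=sys.stdout):
    """Run argv and copy its stdout to `out`, prefixing each line with the rank."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        out.write(f"[{rank}] {line}")
    return proc.wait()  # exit status of the wrapped command


# Outside of an MPI run PMIX_RANK is unset, so fall back to "?".
rank = os.environ.get("PMIX_RANK", "?")
buffer = io.StringIO()
status = run_with_rank_prefix(
    [sys.executable, "-c", "print('hello'); print('world')"], rank, buffer)
print(buffer.getvalue(), end="")
```

Because the wrapper takes the remaining command line as its arguments, wrappers of this kind can be chained in the same way as the bash scripts above.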

TUM Interdisciplinary Project, Jan Fecht, 2023. pictures & videos: CC BY 3.0, code snippets: MIT