\documentclass[10pt]{article}
\usepackage[margin=3cm]{geometry}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{eurosym}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}
\usepackage{amssymb}
\usepackage{graphicx}
\usepackage{color}
\pagestyle{plain}
\usepackage{hyperref}

\begin{document}

\title{Tracing manual for Mont-Blanc prototype}
\author{ }
\date{2024-04-19}
\maketitle

{\bf NOTE:} This guide explains how to generate a Paraver trace with the Extrae tracing tool using the \verb|LD_PRELOAD| method. Other methods are available, but they are not covered here.

\section{MPI (+ programming model) applications}

\subsection{Prepare your binary}

{\bf This step is only needed for C/C++ applications.}

Due to some bugs in the libunwind library used by Extrae, the first thing to do is to recompile the application with the following flags. Otherwise, the MPI application is likely to crash with a segmentation fault during the execution.

\begin{verbatim}
-funwind-tables -g
\end{verbatim}

\subsection{Prepare your job script}

Now we need to modify our job script so that the execution generates a trace with Extrae. The first thing to do is to load the module of the MPI implementation we want to use; once done, load the Extrae module.

\begin{verbatim}
druiz@mb-login-12:~$ module load openmpi
load openmpi/1.10.0 (PATH, MANPATH, LD_LIBRARY_PATH)
druiz@mb-login-12:~$ module load extrae
load extrae/3.2.1 (PATH, LD_LIBRARY_PATH, C_INCLUDE_PATH, EXTRAE_HOME)
\end{verbatim}

At this point the environment variable \verb|EXTRAE_HOME| should be set to the path of the Extrae installation we want to use. Now we need to copy the required files to the folder where our job script is located. This step depends on the programming model we are using; note that the exact folders may also change with the programming model.

\begin{verbatim}
druiz@mb-login-12:~/job$ cp $EXTRAE_HOME/share/example/${PROGRAMMING_MODEL}/extrae.xml .
druiz@mb-login-12:~/job$ cp $EXTRAE_HOME/share/example/${PROGRAMMING_MODEL}/ld-preload/trace.sh .
\end{verbatim}

where \verb|${PROGRAMMING_MODEL}| can be one of the following:

\begin{itemize}
\item MPI
\item MPI+OMP
\item MPI+OMPSS
\item MPI+OPENCL
\end{itemize}

Now edit the \verb|trace.sh| file. Since \verb|extrae.xml| is now located in the same folder, make sure that the \verb|EXTRAE_CONFIG_FILE| variable points to the correct path.

\begin{verbatim}
#!/bin/bash

export EXTRAE_HOME=/apps/extrae/3.2.1/openmpi/1.10.0
export EXTRAE_CONFIG_FILE=./extrae.xml

# Example for an MPI-only application, set only one
export LD_PRELOAD=${EXTRAE_HOME}/lib/libmpitrace.so   # For C apps
#export LD_PRELOAD=${EXTRAE_HOME}/lib/libmpitracef.so # For Fortran apps

## Run the desired program
$*
\end{verbatim}

Regarding the \verb|extrae.xml| file, we will not cover the different options available here. For this, check the commented examples provided in \verb|$EXTRAE_HOME/share/example/${PROGRAMMING_MODEL}|.

The last thing to do is to modify our job script. The only change needed is in the \verb|srun| command, where we add the execution of the \verb|trace.sh| script.

\begin{verbatim}
#!/bin/bash
#SBATCH --partition=mb
#SBATCH --ntasks=64
#SBATCH --cpus-per-task=1
#SBATCH --out=mpi-%j.out
#SBATCH --err=mpi-%j.err
#SBATCH --time=10:00

srun ./trace.sh ./mpi_binary
\end{verbatim}

\subsection{Submit your job}

No special effort is needed at this point; just submit your job as usual.

\begin{verbatim}
druiz@mb-login-12:~/job$ sbatch job.sh
\end{verbatim}

Once the job finishes, the trace should have been generated. By default it has the same name as the application binary and is written to the folder from which the application was executed. The trace consists of three files, with extensions \verb|.prv|, \verb|.row| and \verb|.pcf|.
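As an illustration, for a binary called \verb|mpi_binary| launched from a \verb|~/job| folder like the one used above, the folder after a successful traced run might look similar to the following (the listing is only an example, not literal output):

\begin{verbatim}
druiz@mb-login-12:~/job$ ls
extrae.xml  job.sh  mpi_binary  mpi_binary.pcf
mpi_binary.prv  mpi_binary.row  trace.sh
\end{verbatim}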
\section{Manual merge of the trace}

{\bf NOTE:} The merge process is done automatically by Extrae at the end of the execution and, almost every time, it works fine. For the cases where it does not, disable the \verb|merge| tag in \verb|extrae.xml| and follow the steps below.

Merging the trace can be done in a serial or a parallel way. We strongly suggest running the merger in parallel, since merging big traces can take a long time. The easiest approach is to put the parallel merge inside your job script; this way it uses the same number of processes as the application, although it can also be run with a different number of nodes.

The following example shows how to merge the trace in the same job script used to generate it.

\begin{verbatim}
#!/bin/bash
#SBATCH --partition=mb
#SBATCH --ntasks=64
#SBATCH --cpus-per-task=1
#SBATCH --out=mpi-%j.out
#SBATCH --err=mpi-%j.err
#SBATCH --time=10:00

srun ./trace.sh ./mpi_binary

# Make sure the intermediate files are synced
sync
sleep 5s

# Merge the trace; set TRACE_NAME to the desired name of the final trace
srun mpimpi2prv -f TRACE.mpits -o ${TRACE_NAME}.prv
\end{verbatim}
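If the trace is small or running an extra parallel step is not convenient, the merge can also be done serially, as mentioned above. The following is a minimal sketch, assuming the intermediate files and the \verb|TRACE.mpits| file were left in the submission directory; it uses the serial \verb|mpi2prv| merger shipped with Extrae.

\begin{verbatim}
# Serial merge, run from the folder that contains TRACE.mpits.
# Replace ${TRACE_NAME} with the desired name of the final trace.
${EXTRAE_HOME}/bin/mpi2prv -f TRACE.mpits -o ${TRACE_NAME}.prv
\end{verbatim}

\end{document}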