\documentclass[10pt]{article}
\usepackage[margin=3cm]{geometry}
\usepackage{amsmath}
\usepackage{amsfonts}
\usepackage{eurosym}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}
\usepackage{amssymb}
\usepackage{graphicx}
\usepackage{color}
\pagestyle{plain}
\usepackage{hyperref}

\begin{document}

\title{Mont-Blanc Prototype}
\author{ }
\date{2024-04-24}
\maketitle

The Mont-Blanc prototype is located at the BSC facilities.

\section{Cluster information}

\subsection{Setup}

Use the {\bf sinfo} command to check node availability.

\begin{itemize}
\item {\bf Login nodes}: ARM-based machines that provide access to the cluster.
  \begin{itemize}
  \item Ubuntu 14.04, Kernel 3.11.0
  \item 15 login nodes; users are logged into one of them at random when connecting to the cluster
  \end{itemize}
\item {\bf Blades}: the cluster has 63 blades, each containing 15 SDB nodes.
  \begin{itemize}
  \item The blades are interconnected using a 10GbE network.
  \end{itemize}
\item {\bf Nodes}: more information about the SDB nodes is available on the SDB wiki page.
  \begin{itemize}
  \item Kernel 3.11.0-bsc\_opencl+
  \end{itemize}
\end{itemize}

\subsection{Storage}

Each node has {\bf 15GB} on its root partition. The {\bf /home} and {\bf /apps} folders are mounted over the network (Lustre) from the file servers and are shared among all the nodes.

\subsection{Power monitoring}

\subsubsection{Getting the energy-to-solution of your job}

Retrieving the energy-to-solution is straightforward since SLURM keeps track of it. You can query the SLURM accounting database using the {\bf sacct} command. An example output of sacct looks like this:

\begin{verbatim}
user@mb-login-12:~$ sacct -o JobID,ConsumedEnergy
       JobID ConsumedEnergy
------------ --------------
74427
74427.batch               0
74427.0               4.29M
74438
74438.batch               0
74438.0                 160
\end{verbatim}

You can see that job 74438 consumed 160 Joules. Be aware that if an error occurs during the measurement, the value 4.29M is reported instead (as for job 74427). We know that this is not very useful, but it is caused by a combination of some internal quirks and a bug in SLURM.
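If you are only interested in a single job, you can restrict the query to its job id and request additional fields such as the elapsed time; for example (using the job id from the listing above):

\begin{verbatim}
# Energy and elapsed time for one specific job
sacct -j 74438 -o JobID,ConsumedEnergy,Elapsed
\end{verbatim}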
\subsubsection{Getting a full power trace}

To retrieve a power trace for your application, you need to know the nodes on which your application was executed as well as the application's start and end times. One way to achieve this is to augment your job script as outlined below.

\paragraph{Example Job Script}

\begin{verbatim}
# Example SLURM job script for power monitoring
# based on a standard hybrid MPI+OpenMP job
#
#SBATCH --partition=mb
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=2
#SBATCH --out=omp-%j.out
#SBATCH --err=omp-%j.err

export OMP_NUM_THREADS=2

# Print the node list
echo "Running on nodes: `scontrol show hostname $SLURM_JOB_NODELIST`"

# Print the start time
echo -n "Execution starts at: "
date +%s

# Run my application
srun my_application

# Print the end time
echo -n "Execution stops at: "
date +%s
\end{verbatim}

\paragraph{Retrieve the Power Trace}

To retrieve the power trace in the form of a comma-separated value (CSV) file, you need the {\bf dcdbquery} tool. It becomes available with:

\begin{verbatim}
module load power_monitor
\end{verbatim}

The dcdbquery tool has the following syntax:

\begin{verbatim}
dcdbquery [-r] [-l] [-h <hostname>] <sensor> [<sensor> ...] <start> <end>

where
  <hostname> - the name of the database server. Use mb.mont.blanc for this.
  <sensor>   - the name of one or more sensors (see below)
  <start>    - start of time series
  <end>      - end of time series
\end{verbatim}

Sensor names consist of the node name and the type of sensor (PWR stands for power consumption). For example, if your job runs on node {\bf mb-237}, the name of the associated power sensor is {\bf mb-237-PWR}.

Start and end times can be supplied in two formats:

\begin{itemize}
\item Human readable: supply them as {\bf 'yyyy-mm-dd hh:mm:ss'} (with quotes), e.g. {\bf '2015-04-16 15:38:29'}
\item Unix epoch: corresponding to the output of {\bf 'date +\%s'}
\end{itemize}

{\bf By default, times are interpreted to be in UTC!} Use the {\bf -l} option of dcdbquery to switch the interpretation of the start and end times, as well as the generated output, to your local timezone. If the {\bf -r} option is specified, the generated output contains the raw internal timestamps (nanoseconds since the UNIX epoch) instead of the human-readable ISO format.

Example:

\begin{verbatim}
dcdbquery -h mb.mont.blanc -r mb-915-PWR 1429178521 1429178527
Sensor,Time,Value
mb-915-PWR,1429178521580000000,8458
mb-915-PWR,1429178522600000000,9462
mb-915-PWR,1429178523620000000,8994
mb-915-PWR,1429178525650000000,7475
mb-915-PWR,1429178526670000000,7472
\end{verbatim}

\paragraph{Creating nice plots}

The CSV output of dcdbquery can be read by many popular applications (Excel, OpenOffice Calc, etc.). The following script uses some bash scripting fun to generate a nice plot of your application's power trace based on the CSV output of dcdbquery (please run dcdbquery with the -r option!):

\begin{verbatim}
if [ "$#" -lt "1" ]; then
    echo "Usage: plot.sh <csvfile> [outfile.png]"
    exit
fi

# Check that the input file exists
if [ ! -e $1 ]; then
    echo "File $1 does not exist."
    exit
fi

# Check that tmp doesn't exist
if [ -e tmp ]; then
    echo "tmp directory already exists. Aborting."
    exit
fi

# Get length of file (minus the CSV header line)
L=`cat $1 | wc -l`
L=$(($L-1))

# Check for the contained sensor names
S=`tail -n $L $1 | awk -F "," '{printf("%s\n", $1)}' | uniq`
echo "Found sensors:"
echo "$S"

# Create a file for each sensor
mkdir -p tmp
while read -r i; do
    echo "Creating tmp/$i".dat
    grep "$i" $1 | sort > "tmp/$i".dat
done <<< "$S"

# Find the smallest time stamp
U="2999999999999999999"
while read -r i; do
    T=`head -n1 tmp/${i}.dat | awk -F "," '{print $2}'`
    echo "tmp/${i}.dat starts at: $T"
    if [ "$T" -lt "$U" ]; then
        U="$T"
    fi
done <<< "$S"
echo "Plot starts at: $U"

# Write new data files with updated time stamps (seconds relative to start)
while read -r i; do
    cat tmp/${i}.dat | awk -F "," '{printf("%s,%f,%s\n",$1,($2-'$U')/1000000000,$3)}' > tmp/${i}.dat2
done <<< "$S"

# Generate Plot File (general stuff)
cat > tmp/plot.gplt << EOF
# GNUPlot file for visualizing dcdbqueries
set title 'MontBlanc Power Consumption'
set xlabel "Time (sec)"
set ylabel "Power (mW)"
set border linewidth 1.5
set datafile separator ","
EOF

# Generate Plot File (output config)
if [ "$#" -eq "2" ]; then
cat >> tmp/plot.gplt << EOF
set terminal png
set output '$2'
EOF
fi

# Generate Plot File (plot data)
cat >> tmp/plot.gplt << EOF
plot \\
EOF
while read -r i; do
    echo "'tmp/${i}.dat2' using 2:3 title '$i' with lines, \\" >> tmp/plot.gplt
done <<< "$S"
cat >> tmp/plot.gplt << EOF

EOF

# Generate Plot File (wait if interactive)
if [ "$#" -eq "1" ]; then
cat >> tmp/plot.gplt << EOF
pause -1
EOF
fi

# Show Plot
gnuplot tmp/plot.gplt

# Delete tmp dir
rm -r tmp
\end{verbatim}
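Putting the pieces together, a typical workflow could look like the sketch below. The node name, the two epoch timestamps, and the file names are placeholders taken from the examples above, and the plotting script is assumed to be saved as plot.sh:

\begin{verbatim}
# Sketch of a complete workflow (node, timestamps and file names are placeholders)
module load power_monitor

# Fetch the raw power trace for the node and time window reported by the job
dcdbquery -h mb.mont.blanc -r mb-915-PWR 1429178521 1429178527 > mytrace.csv

# Plot it (writes power.png) using the script above, saved as plot.sh
./plot.sh mytrace.csv power.png
\end{verbatim}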
\section{Login information}

\subsection{Account}

Please do not share your account with other people. If several users from your institution require access to the ARM clusters, please request one account per user.

\subsection{Login}

You can access the prototype cluster using ssh:

\begin{itemize}
\item Log into the Mont-Blanc cluster login node with your user:
  \begin{itemize}
  \item {\bf ssh user@mb.bsc.es}
  \end{itemize}
\end{itemize}

At this point you are on one of the login nodes (ARM-based machines). From here you can compile your applications (the available software is found under /apps) or execute your application using the job scheduler (SLURM).

\subsection{Change Password}

\section{Software available}

\subsection{Location}

\begin{itemize}
\item All the software is located in the {\bf /apps} folder.
\end{itemize}

\subsection{Software}

\begin{itemize}
\item GNU Bison
\item CLooG
\item FLEX
\item GMP Library
\item ISL
\item libunwind
\item GNU MPC
\item GNU MPFR
\item PAPI
\item GNU Compiler Suite
\item Environment Modules
\item Boost
\end{itemize}

\begin{itemize}
\item Runtime
  \begin{itemize}
  \item MPICH
  \item OpenMPI
  \item OpenCL Full Profile
  \item OmpSs (stable and development)
  \item Open-MX
  \end{itemize}
\item Scientific libraries
  \begin{itemize}
  \item FFTW
  \item HDF5
  \item ATLAS
  \item clBLAS
  \end{itemize}
\item Development tools
  \begin{itemize}
  \item Extrae
  \item Allinea DDT
  \item Scalasca
  \item LTTNG
  \end{itemize}
\item Frameworks
  \begin{itemize}
  \item GASNet
  \end{itemize}
\end{itemize}

SLURM script example for an MPI application (replace the uppercase placeholders with values appropriate for your job; a filled-in example follows the note below):

\begin{verbatim}
#!/bin/sh
#SBATCH --ntasks=$NTASKS
#SBATCH --cpus-per-task=$NCPU_TASK
#SBATCH --partition=$PARTITION
#SBATCH --job-name=$JOB_NAME
#SBATCH --error=err/$JOB_NAME-%j.err
#SBATCH --output=out/$JOB_NAME-%j.out
#SBATCH --workdir=/path/to/binaries

srun ./$PROG
\end{verbatim}

{\bf NOTE:} to run an OpenCL application you must add {\bf \#SBATCH -{}-gres=gpu} to your job script.
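For illustration, a filled-in version of the template might look as follows. The partition, task counts, job name, and program are example values only and should be adapted to your job (the err and out directories must exist before you submit the script):

\begin{verbatim}
#!/bin/sh
# Example instantiation of the template above (all values are placeholders)
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=2
#SBATCH --partition=mb
#SBATCH --job-name=myapp
#SBATCH --error=err/myapp-%j.err
#SBATCH --output=out/myapp-%j.out
#SBATCH --workdir=/path/to/binaries
# Uncomment the next line for OpenCL applications:
##SBATCH --gres=gpu

srun ./myapp
\end{verbatim}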
\subsection{Environment Modules}

The Environment Modules package provides for the dynamic modification of a user's environment via modulefiles. Each modulefile contains the information needed to configure the shell for an application. Once the Modules package is initialized, the environment can be modified on a per-module basis using the module command, which interprets modulefiles. Typically, modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, etc. Modulefiles may be shared by many users on a system, and users may have their own collection to supplement or replace the shared modulefiles. Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. Modules are useful for managing different versions of applications, and they can also be bundled into metamodules that load an entire suite of different applications.

A complete list of available modules can be obtained with the command {\bf module avail} on one of the login nodes, which produces output like the following:

\begin{verbatim}
druiz@mb-login-1:~$ module avail

---------- /apps/modules/3.2.10/Modules/default/modulefiles/compilers ----------
gcc/4.8.2            mpich/3.1.4_omx       openmpi/1.6.4_experimental_tracing_debug
gcc/4.9.0            ompss/14.10           openmpi/1.6.4_experimental_tracing_gunwind
gcc/4.9.1            ompss/15.02           openmpi/1.8.3
gcc/4.9.2            ompss/15.04(default)  openmpi/1.8.3_omxTesting
gcc/5.1.0            ompss/git-gcc-4.8.4   openmpi/1.8.7
gcc/5.2.0(default)   openmpi/1.10.0        openmpi/1.8.8(default)
mpich/3.1.3          openmpi/1.6.4         openmpi/1.8.8_omx
mpich/3.1.4(default) openmpi/1.6.4_experimental_tracing

---------- /apps/modules/3.2.10/Modules/default/modulefiles/tools ----------
allineaDDT/latest(default)  extrae/3.2.0          papi/5.4.1(default)                    scalasca/2.2.1(default)
extrae/2.5.0                lttng/2.6.2(default)  perf/3.11.0-bsc_opencl+                scalasca/2.2.1_externCube
extrae/3.0.1                papi/5.3.0            perf/3.11.0-bsc_opencl_dvfs+(default)  scons/2.3.6(default)
extrae/3.1.0(default)       papi/5.4.0            power_monitor/power_monitor(default)

---------- /apps/modules/3.2.10/Modules/default/modulefiles/libraries ----------
atlas/3.11.27                  clBLAS/latest(default)  fortranCL/0.1alpha4(default)  liburcu/0.8.7(default)  scalapack/2.0.2(default)
atlas/3.11.31_lapack(default)  clFFT/latest(default)   hdf5/1.8.13(default)          opencl/1.1.0(default)   vtk/6.1.0(default)
boost/1.56.0                   fftw/2.1.5              hdf5/1.8.13_parallel          opengl/2.4.0(default)
boost/1.58.0(default)          fftw/3.3.4(default)     lapack/3.5.0(default)         petsc/3.5.3(default)

---------- /apps/modules/3.2.10/Modules/default/modulefiles/frameworks ----------
GASNet/1.24.2(default)
\end{verbatim}

Note that several versions are provided for some of the modules, such as gcc. The following examples show how to obtain information about a module and how to list the currently loaded modules; several modules can be specified at once when loading or unloading.

\begin{verbatim}
druiz@mb-login-1:~$ module help gcc/4.9.2

load gcc/4.9.2 (PATH, MANPATH, LD_LIBRARY_PATH)

druiz@mb-login-1:~$ module list
Currently Loaded Modulefiles:
  1) atlas/3.11.31_lapack   5) hdf5/1.8.13     9) extrae/3.0.1
  2) lapack/3.5.0           6) opencl/1.1.0   10) gcc/4.9.2
  3) boost/1.56.0           7) papi/5.4.0
  4) fftw/3.3.4             8) openmpi/1.8.3
\end{verbatim}

\subsubsection{Loading modules at login}

If you want some modules to be loaded automatically when logging in to the Mont-Blanc cluster, you can add the following to your .bashrc:

\begin{verbatim}
module load gcc openmpi
\end{verbatim}
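When a module is requested without a version, the version marked (default) in the listing above is used. If you prefer to pin specific versions, for example in your .bashrc or in a job script, name them explicitly; a minimal sketch using versions taken from the listing above:

\begin{verbatim}
# Load specific versions instead of the defaults
module load gcc/4.9.2 openmpi/1.8.8

# Check what is currently loaded
module list

# Unload them again when they are no longer needed
module unload gcc/4.9.2 openmpi/1.8.8
\end{verbatim}

\end{document}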