DFSSSP - A Deadlock-Free Fast Routing Algorithm for InfiniBand/OpenSM

The DFSSSP algorithm [1] is a deadlock-free version of the SSSP algorithm [2], which balances routes along shortest paths. SSSP delivers a significantly higher effective bisection bandwidth [3] than the other routing algorithms implemented in OpenSM [1,2].

Download patched OpenSM:

See the README file in the package.

Installing the patched OpenSM
tar xzf management.tar.gz
cd management
export CONFIG_OPTS="--prefix=${HOME}/ofed"
export CONFIG_OPTS="${CONFIG_OPTS} LDFLAGS=-L${HOME}/ofed/lib"
export CONFIG_OPTS="${CONFIG_OPTS} CPPFLAGS=-I${HOME}/ofed/include"

cd libibumad
./autogen.sh && ./configure ${CONFIG_OPTS}
make
make install
cd ..

cd libibmad
./autogen.sh && ./configure ${CONFIG_OPTS}
make
make install
cd ..

cd opensm
./autogen.sh && ./configure ${CONFIG_OPTS}
make
make install
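
The three build steps above run the same commands in each directory, so they can be folded into a loop that stops at the first failure. The following is only a sketch and not part of the package: the function names build_component and build_all and the MAKE override are assumptions made for illustration. It expects CONFIG_OPTS to be exported as shown above and to be called from inside the management directory.

```shell
#!/bin/sh
# Hypothetical helpers; they only wrap the commands listed above.

# build_component DIR: configure, build and install one component.
# Runs in a subshell so the caller's working directory is unchanged.
build_component() {
    (
        cd "$1" || exit 1
        ./autogen.sh || exit 1
        # CONFIG_OPTS is left unquoted on purpose so that it splits
        # into separate arguments for configure.
        ./configure ${CONFIG_OPTS} || exit 1
        "${MAKE:-make}" || exit 1
        "${MAKE:-make}" install
    )
}

# build_all: build the components in the same order as the steps above
# and stop as soon as one of them fails.
build_all() {
    for component in libibumad libibmad opensm; do
        echo "Building ${component}"
        if ! build_component "$component"; then
            echo "Build failed in ${component}" >&2
            return 1
        fi
    done
}
```

After exporting CONFIG_OPTS, a call would look like: cd management && build_all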

Running the patched OpenSM with SSSP (requires root privileges):
cd $HOME/ofed/sbin
./opensm -R sssp

For additional parameters, see ./opensm --help

Running the deadlock-free SSSP routing algorithm (requires root privileges):
cd $HOME/ofed/sbin
./opensm -R dfsssp

Running DFSSSP with a modified QoS file (to configure the SL/VL load):
cd $HOME/ofed/sbin
./opensm -R dfsssp --qos_policy_file ${HOME}/ofed/qos.conf
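
The three invocations above differ only in the routing engine and the optional QoS file, and all of them need root. A small wrapper can check both before starting the subnet manager. This is a sketch under assumptions: the function name run_opensm and the OPENSM override variable are hypothetical, and only the -R and --qos_policy_file options shown above are used.

```shell
#!/bin/sh
# Hypothetical wrapper around the opensm calls shown above.

# run_opensm ENGINE [QOS_FILE]
# ENGINE is limited to the two engines this text covers (sssp, dfsssp).
run_opensm() {
    engine=$1
    qos_file=${2:-}

    case "$engine" in
        sssp|dfsssp) ;;
        *) echo "unsupported engine for this sketch: $engine" >&2
           return 1 ;;
    esac

    # opensm has to be started as root.
    if [ "$(id -u)" -ne 0 ]; then
        echo "opensm must be started as root" >&2
        return 1
    fi

    if [ -n "$qos_file" ]; then
        if [ ! -r "$qos_file" ]; then
            echo "cannot read QoS file: $qos_file" >&2
            return 1
        fi
        set -- -R "$engine" --qos_policy_file "$qos_file"
    else
        set -- -R "$engine"
    fi

    # OPENSM is an assumed override; the default is the install path above.
    "${OPENSM:-$HOME/ofed/sbin/opensm}" "$@"
}
```

Example calls: run_opensm sssp, or run_opensm dfsssp ${HOME}/ofed/qos.conf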

Possible content of the qos.conf file:
qos_max_vls 8
qos_high_limit 0

qos_ca_vlarb_high 0:64,1:64,2:64,3:64,4:64,5:64,6:64,7:64
qos_ca_vlarb_low 0:4,1:4,2:4,3:4,4:4,5:4,6:4,7:4
qos_ca_sl2vl 0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7

qos_swe_vlarb_high 0:64,1:64,2:64,3:64,4:64,5:64,6:64,7:64
qos_swe_vlarb_low 0:4,1:4,2:4,3:4,4:4,5:4,6:4,7:4
qos_swe_sl2vl 0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7

For an explanation, see management/opensm/doc/QoS_management_in_OpenSM.txt
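
The example file gives every VL the same weight and maps the 16 SLs onto the 8 VLs in round-robin order. Assuming that pattern is what is wanted, the file can be generated for a different number of VLs or different weights. The helper names vlarb_entries, sl2vl_entries and gen_qos_conf are hypothetical, and the modulo mapping is inferred from the sl2vl rows above rather than taken from the OpenSM documentation.

```shell
#!/bin/sh
# Hypothetical generator for a qos.conf in the layout shown above.

# vlarb_entries NUM_VLS WEIGHT: print "0:W,1:W,...", one entry per VL.
vlarb_entries() {
    num_vls=$1
    weight=$2
    out=""
    vl=0
    while [ "$vl" -lt "$num_vls" ]; do
        out="${out}${out:+,}${vl}:${weight}"
        vl=$((vl + 1))
    done
    printf '%s\n' "$out"
}

# sl2vl_entries NUM_VLS: map each of the 16 SLs to VL (SL mod NUM_VLS).
sl2vl_entries() {
    num_vls=$1
    out=""
    sl=0
    while [ "$sl" -lt 16 ]; do
        out="${out}${out:+,}$((sl % num_vls))"
        sl=$((sl + 1))
    done
    printf '%s\n' "$out"
}

# gen_qos_conf NUM_VLS HIGH_WEIGHT LOW_WEIGHT: print the whole file,
# using the same values for the ca and swe settings as the example does.
gen_qos_conf() {
    num_vls=$1
    high=$(vlarb_entries "$num_vls" "$2")
    low=$(vlarb_entries "$num_vls" "$3")
    sl2vl=$(sl2vl_entries "$num_vls")
    printf 'qos_max_vls %s\n' "$num_vls"
    printf 'qos_high_limit 0\n\n'
    for port in ca swe; do
        printf 'qos_%s_vlarb_high %s\n' "$port" "$high"
        printf 'qos_%s_vlarb_low %s\n' "$port" "$low"
        printf 'qos_%s_sl2vl %s\n\n' "$port" "$sl2vl"
    done
}
```

Calling gen_qos_conf 8 64 4 > ${HOME}/ofed/qos.conf writes the same settings as the example file above.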

DFSSSP was developed by Jens Domke at the ZIH at Technische Universität Dresden; the scientific work was advised by Torsten Hoefler.

References

IPDPS'11
[1] J. Domke, T. Hoefler, W. Nagel:
 Deadlock-Free Oblivious Routing for Arbitrary Topologies. In Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS), presented in Anchorage, AK, USA, pages 613-624, IEEE Computer Society, ISBN: 0-7695-4385-7, May 2011 (acceptance rate: 19.6%, 112/571)
HotI'09
[2] T. Hoefler, T. Schneider, A. Lumsdaine:
 Optimized Routing for Large-Scale InfiniBand Networks. In 17th Annual IEEE Symposium on High Performance Interconnects (HOTI 2009), presented in New York, NY, USA, Aug. 2009
TUM'08
[3] T. Hoefler:
 Towards coordinated optimization of computation and communication in parallel applications (Presentation). Fakultät für Informatik, Universität Münster, presented in Münster, Germany, Jun. 2008