Distributed memory programming in Peano can be tricky, as we have a lot of different threads per rank and the ranks run pretty asynchronously.
Therefore, it is always a bad idea to use blocking or synchronising routines. Instead, you should rely on the routines implemented within tarch::parallel, as they usually take the MPI+X character of the code into account. Here are a few recommendations:
- Never block while you wait for a message. Use non-blocking operations and poll with MPI_Test. While you poll, invoke receiveDanglingMessages() on the services to free the queue from other messages, such as load balancing requests, which might be in your way.
- Make your class a service (i.e. derive from tarch::services::Service) and equip it with a receiveDanglingMessages() routine which probes the MPI queues (see the sketch below). If a message of relevance is in the MPI queue, receive it. If you can't react directly, buffer it internally. This routine ensures that the MPI buffers can't overflow. Overflows force MPI to fall back from eager to rendezvous communication, which can lead to deadlocks. If a message consists of multiple parts, you might have to fuse them into one message; otherwise the first part might be received and queue polling (on another thread) could grab the second part, misinterpreting it as the first part of yet another message.
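A minimal sketch of such a service, assuming a reserved message tag and double-valued messages (both hypothetical); the exact base class interface and registration call may differ between Peano versions:

```cpp
#include <vector>
#include <mpi.h>

#include "tarch/services/Service.h"

// Hypothetical service that drains its own tag from the MPI queue so the
// buffers cannot overflow. Tag value and message type are assumptions.
class MyDataExchangeService: public tarch::services::Service {
  private:
    static constexpr int    Tag = 42;           // tag reserved for this service
    std::vector< double >   _bufferedMessages;  // messages we cannot process right away
  public:
    void receiveDanglingMessages() override {
      int        flag = 0;
      MPI_Status status;
      // Probe only for our own tag, so we never steal other components' messages.
      MPI_Iprobe( MPI_ANY_SOURCE, Tag, MPI_COMM_WORLD, &flag, &status );
      while (flag) {
        double message;
        MPI_Recv( &message, 1, MPI_DOUBLE, status.MPI_SOURCE, Tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
        // If we cannot react immediately, buffer the message internally.
        _bufferedMessages.push_back( message );
        MPI_Iprobe( MPI_ANY_SOURCE, Tag, MPI_COMM_WORLD, &flag, &status );
      }
    }
};
```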
If you reduce data between the ranks, I recommend that you use the tarch wrappers. They have two advantages: they automatically provide deadlock tracking, and they poll the MPI queues, i.e. if the code stalls because too many messages for other purposes (global load balancing, plotting, ...) drop in, they remove these messages from the buffers such that MPI can make progress. The usage is detailed in tarch::mpi::Rank::allReduce(), but the code usually resembles the sketch below.
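A minimal sketch, assuming allReduce() accepts the usual MPI reduction arguments plus a functor that is invoked while the call waits; the functor argument and the ServiceRepository call are assumptions, so check tarch/mpi/Rank.h for the precise signature in your version:

```cpp
double localValue  = computeLocalContribution();  // hypothetical local quantity
double globalValue = 0.0;

// Reduce over all ranks; the lambda is invoked repeatedly while the reduction
// waits, so dangling messages (load balancing, plotting, ...) can be drained.
tarch::mpi::Rank::getInstance().allReduce(
  &localValue, &globalValue, 1, MPI_DOUBLE, MPI_SUM,
  [&]() -> void {
    tarch::services::ServiceRepository::getInstance().receiveDanglingMessages();
  }
);
```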
I also offer a boolean semaphore across all MPI ranks. This one is quite expensive, as it has to lock both all threads and all ranks and thus sends quite a few messages back and forth.
The usage is similar to the shared memory lock, but you have to give the semaphore a unique name (you might construct a string using the __LINE__ and __FILE__ macros). It is this name that is used to ensure that the semaphore is unique across all ranks. In this way, the semaphore is similar to an OpenMP named critical section. I recommend avoiding MPI barriers whenever possible. Details can be found in tarch::mpi::Rank::barrier().
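A minimal sketch of the pattern, assuming the MPI semaphore mirrors the shared-memory BooleanSemaphore/Lock pair; the class names, the header path, and the string constructor argument are assumptions to be checked against your tarch version:

```cpp
#include <string>

#include "tarch/mpi/BooleanSemaphore.h"   // header name is an assumption

void updateGlobalStatistics() {
  // The string identifier makes the semaphore unique across all ranks,
  // similar to an OpenMP named critical section. __FILE__/__LINE__ keep
  // the name distinct from other semaphores in the code base.
  static tarch::mpi::BooleanSemaphore semaphore(
    std::string(__FILE__) + ":" + std::to_string(__LINE__)
  );

  tarch::mpi::Lock lock(semaphore);   // acquired on construction
  // ... update data that no other thread or rank may touch concurrently ...
}                                     // released when the lock leaves scope
```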