|
August 23, 2012, 09:42 |
MPI issue on multiple nodes
|
#1 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Dear All,
I'm setting up a mini cluster (4 nodes) to decrease the solution time with OpenFOAM. The OS is Ubuntu 12.04 LTS and there is no scheduler (I know this is not the best course, but...). I've NFS-exported my home directory to the slaves and have passwordless SSH access. The same version of OpenFOAM, ParaView and the ThirdParty packages is installed on every machine via script. I've created the machinefile using the IPs of the slaves: Code:
192.168.0.17
192.168.0.19
192.168.0.21
192.168.0.23
To test the configuration I meshed and decomposed the motorBike tutorial, but when I launch the command Code:
mpirun -np 4 --hostfile [my machinefile] simpleFoam -parallel > log
I get the following error: Code:
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not find an executable:

Executable: simpleFoam
Node: 192.168.0.19

while attempting to start process rank 1.
--------------------------------------------------------------------------
To my uninformed judgement it looks like it cannot find the simpleFoam executable, but, since the home directory is exported to all the nodes, I should already have all the necessary information in the .bashrc file. And I do: if I ssh to a node and check it, the last line shows Code:
source /opt/openfoam211/etc/bashrc
Thanks in advance |
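P.S.: a check I still have to run is asking one of the slaves for the solver over a non-interactive ssh session, which is the kind of shell mpirun opens on each node; something along these lines (the IP is one of my slaves): Code:
# non-interactive remote shell, like the one mpirun uses to start ranks;
# if this prints nothing, the OpenFOAM environment is not being sourced there
ssh 192.168.0.19 'which simpleFoam'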
|
August 23, 2012, 13:34 |
|
#2 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Quick update:
Moving the "source /opt/openfoam211/etc/bashrc" line to the top solved that issue. Now launching the case works - kind of. Tailing the log shows that it goes no further than the "Create time" step, but top on the master node and on the slaves shows the processes alive and running at 100%. Ideas? Last edited by sail; August 23, 2012 at 14:07. |
|
August 23, 2012, 15:49 |
|
#3 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Greetings Vieri,
Unfortunately I don't have much time to help diagnose the issue, so I'll have to refer you to this quote from my signature link: Quote:
Bruno
__________________
|
August 24, 2012, 22:41 |
|
#4 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Still no success.
Any hello world or other MPI program works, so it is not an MPI or network issue, but the foam job does not get past "Create time": Code:
foamJob -p -s simpleFoam
Parallel processing using SYSTEMOPENMPI with 2 processors
Executing: /usr/bin/mpirun -np 2 -hostfile machines /opt/openfoam211/bin/foamExec -prefix /opt simpleFoam -parallel | tee log
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.1.1                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.1.1-221db2718bbb
Exec   : simpleFoam -parallel
Date   : Aug 25 2012
Time   : 03:32:58
Host   : "Milano1"
PID    : 25714
Case   : /home/cfduser/OpenFOAM/cfduser-2.1.1/run/vieri/tutorials/tutorials/incompressible/simpleFoam/motorBike
nProcs : 2
Slaves : 1 ( "Milano4.21614" )
Pstream initialized with:
    floatTransfer     : 0
    nProcsSimpleSum   : 0
    commsType         : nonBlocking
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time
The case works well both in serial and in parallel on a single machine, on both the master and any of the slaves. Any ideas? |
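In case it helps, the next thing I mean to try is asking Open MPI itself to be more verbose about what the TCP transport is doing, along these lines (the verbosity level is arbitrary): Code:
# same launch as before, with the byte-transfer layer reporting what it is doing
mpirun -np 2 -hostfile machines --mca btl_base_verbose 30 \
    /opt/openfoam211/bin/foamExec -prefix /opt simpleFoam -parallel | tee log.verbose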
|
August 24, 2012, 22:52 |
|
#5 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Using the Test-parallel application instead gives the same result.
|
|
August 25, 2012, 05:31 |
|
#6 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Hi Vieri,
Weekend is here, so it'll be easier to start helping you. OK, so you've got 4 machines on your cluster that are unable to communicate with each other. All of them have OpenFOAM installed in the same folder, or at least the folder where the installation lives is shared. One of the reasons for this kind of lock-up is that there is more than one way for each machine to reach the network. My guess is that the master node has at least 2 ethernet cards, one for the outside world and another for the cluster network. Therefore, to check whether this is indeed the problem, you can log in to one of the cluster nodes and launch the Test-parallel application using only two of those nodes. If that works, try using 3 nodes, without the master. If that works, then the problem is indeed that the master has two cards and Open MPI gets lost on the master while searching for other ways to reach the nodes. To override this behaviour, check what ethernet interfaces you've got on the master node: Code:
ifconfig -a
You should also confirm the interface names used on the slave nodes. Now edit the foamJob script - or create your own copy of it - find this block of code and add the last line shown here (the --mca one): Code:
#
# locate mpirun
#
mpirun=`findExec mpirun` || usage "'mpirun' not found"
mpiopts="-np $NPROCS"
mpiopts="$mpiopts --mca btl_tcp_if_exclude lo,eth1"
Hopefully this will do the trick. Best regards, Bruno
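PS: if you want to confirm it helps before touching foamJob, the same option can be passed on a one-off call (the interface names are just my guess from above): Code:
# exclude the loopback and the suspected second card for this run only
mpirun -np 4 --hostfile machines --mca btl_tcp_if_exclude lo,eth1 simpleFoam -parallel > log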
__________________
|
|
August 27, 2012, 20:33 |
|
#7 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
First of all, thanks for taking your well-deserved free time to help me.
I've positively checked that all the networks in use are called eth0.
mpiopts="$mpiopts --mca btl_tcp_if_exclude lo,eth1" and mpiopts="$mpiopts --mca btl_tcp_if_exclude eth1" give the same behaviour: stuck at the "Create time" phase.
mpiopts="$mpiopts --mca btl_tcp_if_exclude lo,eth0" and mpiopts="$mpiopts --mca btl_tcp_if_exclude eth0" don't even get that far: as expected, we excluded the connection, so it doesn't even start. This is the output up to where it gets stuck: Code:
./foam2Job -p -s Test-parallel
Parallel processing using SYSTEMOPENMPI with 2 processors
Executing: /usr/bin/mpirun -np 2 --mca btl_tcp_if_exclude lo,eth0 -hostfile machines /opt/openfoam211/bin/foamExec -prefix /opt Test-parallel -parallel | tee log
cfduser@192.168.0.23's password:
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.1.1                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build : 2.1.1-221db2718bbb
Exec  : Test-parallel -parallel
Date  : Aug 28 2012
Time  : 01:11:00
Host  : "Milano2"
PID   : 22821
I spent some hours doing tests and, just while I was replying to you, I found it! ifconfig highlighted the presence of another network interface called usb0 (I have no idea what it is). Adding it to the parameters to be excluded in the new foamJob script made everything work. I guess Open MPI is really greedy for TCP connections! I guess it wasn't advancing because it was waiting for an answer on this mysterious usb0 device. Thanks again, you are really one of the pillars of this forum, and the time you spend helping users and (wannabe) sysadmins is really appreciated! Words fail me to express my gratitude. Best regards. |
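PS: for the record, this is roughly what ended up in my modified foamJob copy (the interface names are specific to my boxes, so list yours first): Code:
# list every interface the kernel knows about, so nothing like usb0 slips through
ls /sys/class/net

# the exclusion line now used in the foam2Job copy
mpiopts="$mpiopts --mca btl_tcp_if_exclude lo,eth1,usb0"
# (the whitelist form --mca btl_tcp_if_include eth0 would be an alternative)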
|
August 28, 2012, 10:50 |
|
#8 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Hi Vieri,
Basically, if you have 2 cards, both turned on and in the same subnet (the same machine could have 10.0.0.1 and 10.0.0.2, one for each card), then for example when you want to ping another machine you need to specify which interface to use: Code:
ping -I eth1 10.0.0.3
Best regards, Bruno
__________________
|
September 7, 2012, 06:20 |
|
#9 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Up!
Unfortunately, another issue has arisen. If I launch a job using all 16 cores of my 4-machine cluster it gets stuck. I see on the master node a CPU utilization of 25% in user space and 75% in system space, while the slaves are 50% waiting for IO. This behaviour happens even if the master node has just one processor occupied by the job. It looks to me, but I might be wrong, like the master has trouble serving the NFS-shared directory to all the nodes, because if I first copy the case data locally onto the slaves, using the information provided here: http://www.cfd-online.com/Forums/blo...h-process.html it works flawlessly and is really fast. I then tried to stress-test NFS by executing the following script on a slave, but it runs flawlessly. Code:
time dd if=/dev/zero of=~/OpenFOAM/cfduser-2.1.1/testfile1 bs=16k count=10000 &
time dd if=/dev/zero of=~/OpenFOAM/cfduser-2.1.1/testfile2 bs=16k count=10000 &
time dd if=/dev/zero of=~/OpenFOAM/cfduser-2.1.1/testfile3 bs=16k count=10000 &
time dd if=/dev/zero of=~/OpenFOAM/cfduser-2.1.1/testfile4 bs=16k count=10000 &
time dd if=/dev/zero of=~/OpenFOAM/cfduser-2.1.1/testfile5 bs=16k count=10000 &
time dd if=/dev/zero of=~/OpenFOAM/cfduser-2.1.1/testfile6 bs=16k count=10000 &
time dd if=/dev/zero of=~/OpenFOAM/cfduser-2.1.1/testfile7 bs=16k count=10000 &
time dd if=/dev/zero of=~/OpenFOAM/cfduser-2.1.1/testfile8 bs=16k count=10000 &
time dd if=/dev/zero of=~/OpenFOAM/cfduser-2.1.1/testfile9 bs=16k count=10000
On the other hand, when I launch the job I see just 2 nfsd processes at 1% CPU on the master. Running a 4-core job using one core on each machine works well; running a 4-core job using the master, a slave with 2 cores and a slave with 1 core is slower; running a 5-core job with the master and two 2-core slaves is even slower / gets stuck.
master - top: Code:
Tasks: 213 total,   2 running, 206 sleeping,   0 stopped,   5 zombie
Cpu(s):  5.0%us, 20.8%sy,  0.0%ni, 73.9%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:  16423888k total, 15999036k used,   424852k free,       52k buffers
Swap: 15624996k total,  1469876k used, 14155120k free,   213596k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10123 cfduser   20   0  322m 156m  15m R  100  1.0   4:32.87 pimpleDyMFoam
 2200 root      20   0     0    0    0 D    1  0.0   0:10.33 nfsd
 2202 root      20   0     0    0    0 D    1  0.0   0:26.47 nfsd
 9515 root      20   0     0    0    0 S    1  0.0   0:01.24 kworker/0:2
10302 root      20   0     0    0    0 S    1  0.0   0:00.63 kworker/0:3
10301 root      20   0     0    0    0 S    0  0.0   0:00.34 kworker/u:1
one of the slaves - top: Code:
Tasks: 145 total,   1 running, 141 sleeping,   0 stopped,   3 zombie
Cpu(s):  0.0%us,  0.3%sy,  0.0%ni, 49.7%id, 50.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16423888k total, 12310280k used,  4113608k free,      580k buffers
Swap: 15624996k total,     3100k used, 15621896k free, 11622444k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1695 root      20   0     0    0    0 S    1  0.0   0:01.28 kworker/1:0
 4124 root      20   0     0    0    0 D    1  0.0   0:00.56 192.168.0.17-ma
  911 cfduser   20   0 17340 1332  968 R    0  0.0   0:00.51 top
 4658 lightdm   20   0  744m  18m  12m S    0  0.1  86:52.63 unity-greeter
    1 root      20   0 24592 2364 1300 S    0  0.0   0:01.56 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.56 kthreadd
Any ideas on how to fix it? Up to now all the communication on the network happens over gigabit ethernet, and tests show that the transfer rate is reasonable (60 MB/s). I have another ethernet card in every machine, so far unused. Should I activate it, assign a different set of IPs, and use one network for NFS traffic and the other for MPI communication? Do you believe this might be the problem? Any other options or flags I'm missing?
Thanks in advance, best regards |
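PS: one thing I have not tried yet is raising the number of NFS server threads on the master; on Ubuntu that should look something like this (assuming the stock nfs-kernel-server packaging): Code:
# /etc/default/nfs-kernel-server -- bump the number of nfsd threads (default is 8)
RPCNFSDCOUNT=16

# then restart the NFS server on the master
sudo service nfs-kernel-server restart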
|
September 8, 2012, 16:50 |
|
#10 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Hi Vieri,
NFS locking up... sounds familiar, but I can't remember if I ever found the fix for it... But before I forget, here's a link with some hints on improving NFS performance: http://scotgrid.blogspot.pt/2009/11/...guide-for.html Now, let's try diagnosing this another way (hint: jump to #7, then start from the top if that doesn't work):
As for assigning traffic via different cards: AFAIK, it all depends on the kernel (configuration) you're using. From my experience with openSUSE and its default Linux kernel, I've never been able to properly use two networks for connecting a group of machines, because the network stack always sends packets along the shortest path, namely over the first "eth*" it can use for reaching the other machine, even if that means automagically pairing IP addresses on the same NIC. Nonetheless, you might be luckier than I've been so far if you follow these instructions: http://www.open-mpi.org/community/li...9/12/11544.php Best regards, Bruno
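PS: if you do bring the second card online, a minimal sketch of the split would be to give it its own subnet and then point Open MPI at that network only (addresses and interface names below are only placeholders): Code:
# /etc/network/interfaces on each node (Ubuntu 12.04 style): second card, own subnet
auto eth1
iface eth1 inet static
    address 192.168.1.17        # placeholder; a different host number per node
    netmask 255.255.255.0

# keep NFS mounted over the existing 192.168.0.x network (fstab unchanged),
# and tell Open MPI to use only the new network, e.g. in your foamJob copy:
mpiopts="$mpiopts --mca btl_tcp_if_include eth1"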
__________________
|
September 13, 2012, 03:58 |
|
#11 |
New Member
Giuliano Lotta
Join Date: May 2012
Posts: 12
Rep Power: 14 |
Hi wyldckat!
I'm the sysadmin of the cluster where the program is running. First of all, thanks a lot for your huge help.
The cluster is formed by 4 IBM System x Xeon workstations. The OS is Ubuntu 12.04 64-bit, with NFSv3 + OpenFOAM 2.1.1 + Open MPI 1.4.3. Each workstation has 2 ethernet cards, but eth1 is disconnected, and the foam2Job script excludes the eth1 and usb0 interfaces from MPI. I had to downgrade from NFSv4 to NFSv3, because with NFSv4 the motorBike test case locked up.
----------------------------
NFSv3 machine configuration:
SERVER - /etc/exports:
192.168.0.19(rw,sync,root_squash,no_subtree_check) 192.168.0.21(rw,sync,root_squash,no_subtree_check) 192.168.0.23(rw,sync,root_squash,no_subtree_check)
CLIENT - NFSv3, /etc/fstab:
192.168.0.17:/home/cfduser/OpenFOAM /home/cfduser/OpenFOAM/ nfs _netdev,nfsvers=3,proto=tcp,noac,auto 0 0
-----------------------------------------
Unfortunately, _our_ test case is still not running (the MPI tutorials do run); I reply to the points you proposed, to see what else we could try.
1) incompressible/pisoFoam/les/motorBike/motorBike/ works like a charm. I attach a screenshot of nmon on the four workstations while running. After the success of the motorBike MPI run we thought the NFS problem was over.... nope :-( http://ubuntuone.com/6VdREYUP3lU48ii1P6ndvO
2) I didn't change writeInterval in controlDict (of the motorBike tutorial), because it worked out of the box.
4) Open MPI comes in version 1.4.3 together with OpenFOAM with the SGI package. In Ubuntu 12.04, Open MPI 1.5 is available as a separate package, but installing it uninstalls OpenFOAM :-)). Anyway, from what I read in your link, version 1.4.3 should be safer.
5) Our test case is some GB in size, so I fear it exceeds the cache.
6) /opt/openfoam211/etc/controlDict is the same one you posted.
7) As motorBike works flawlessly, I fear it is not (mainly) an NFS configuration problem. It does happen that, if we run our test case and let it fail, the master keeps two nfsd processes alive (with some network activity going on... why??), and then running motorBike leads to failure (motorBike stuck just like our case!). I attach two more screenshots (our case failing/locked, and the 2 zombie nfsd processes left after killing foam2Job with ctrl-c):
http://ubuntuone.com/11t2ZT9VIENlM1etMHbpU1
http://ubuntuone.com/6Y1Kt32czcNMYtRSrKveEW
As can be seen in the screenshots, our test case keeps nodes 3 and 4 waiting. Node 2 is working (?). Maybe it is related in some way to the code, i.e. to the order in which the write and read functions are called, causing a deadlock? Maybe node 2 keeps a file open, locking nodes 3 and 4 out? The problem arises within the first seconds....
Seeing the screenshots, which other tests would you consider? What debug code could we insert into the source to narrow down where the problem is generated?
Thanks again for your huge help. Last edited by Giuliano69; September 14, 2012 at 10:46. |
|
September 14, 2012, 11:18 |
|
#12 |
New Member
Giuliano Lotta
Join Date: May 2012
Posts: 12
Rep Power: 14 |
Update:
Result n° 1: at a first debug, the problem seems to arise from the AMI mesh that has been used. The code gets stuck at #include "createDynamicFvMesh.H", that is: Code:
// debugPimpleDyMFoam.C
int main(int argc, char *argv[])
{
    Info << "-- STARTING ALL" << endl;
    #include "setRootCase.H"
    Info << "-- setRoot done, before createTime" << endl;
    #include "createTime.H"
    Info << "-- createTime done, before createDynmicMesh" << endl;
    #include "createDynamicFvMesh.H"
    Info << "-- creteDynMesh done, before initCont" << endl;
    #include "initContinuityErrs.H"
    Info << "-- Init cont done before createFields" << endl;
    #include "createFields.H"
    Info << "-- createFields done, before readTimeControls" << endl;
    #include "readTimeControls.H"
    Info << "-- redTime done before pimpleControl" << endl;
    pimpleControl pimple(mesh);
    // *
Code:
// createDynamicFvMesh.H
Info<< "Create mesh for time = "
    << runTime.timeName() << nl << endl;

autoPtr<dynamicFvMesh> meshPtr
(
    dynamicFvMesh::New
    (
        IOobject
        (
            dynamicFvMesh::defaultRegion,
            runTime.timeName(),
            runTime,
            IOobject::MUST_READ
        )
    )
);

dynamicFvMesh& mesh = meshPtr()
After this debug, the idea came to try a vanilla example with an AMI mesh, like "mixerVesselAMI2D", on the parallel cluster. motorBike works in parallel, but mixerVesselAMI2D - which uses an AMI mesh - DOESN'T, and gets stuck in the same way as our case. Could it be a problem with the AMI mesh when run on a parallel cluster? |
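For reference, the cross-check was essentially the following (tutorial path as in the OpenFOAM 2.1.x tree; the mesh step uses the tutorial's own script, whose name may differ): Code:
# copy the stock AMI tutorial and run it the same way as our case
cp -r $FOAM_TUTORIALS/incompressible/pimpleDyMFoam/mixerVesselAMI2D $FOAM_RUN/
cd $FOAM_RUN/mixerVesselAMI2D
./makeMesh                      # tutorial's mesh-generation script (m4 + blockMesh)
decomposePar                    # split the case per system/decomposeParDict
foamJob -p -s pimpleDyMFoam     # same wrapper as before: mpirun ... -parallel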
|
September 15, 2012, 09:11 |
|
#13 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Greetings Giuliano,
OK, I've read through your posts here and will follow up at your other thread: http://www.cfd-online.com/Forums/ope...ed-anyone.html Best regards, Bruno
__________________
|
|
October 12, 2012, 12:24 |
|
#14 |
New Member
Giuliano Lotta
Join Date: May 2012
Posts: 12
Rep Power: 14 |
Thanks wyldckat for your kind help.
After some testing and debugging, it became clear that we had the MPI lock-up ONLY when node n°4 was taking part in the MPI run. Although the installation had been done with a bash script, the only solution was a complete format and reinstall of node n°4. After that, everything worked. Strange, but... |
|
October 12, 2012, 12:29 |
|
#15 |
New Member
Giuliano Lotta
Join Date: May 2012
Posts: 12
Rep Power: 14 |
I would also like to share the following experience:
Under Ubuntu 12.04 64-bit, BOTH of the following NFS client configurations (/etc/fstab) were found to work fully with MPI (i.e. it works EITHER as NFSv3 OR as NFSv4):
192.168.0.17:/home/cfduser/OpenFOAM /home/cfduser/OpenFOAM/ nfs4 _netdev,auto 0 0
#192.168.0.17:/home/cfduser/OpenFOAM /home/cfduser/OpenFOAM/ nfs _netdev,nfsvers=3,proto=tcp,noac,auto 0 0
On the SERVER side, this is the /etc/exports configuration:
/home/cfduser/OpenFOAM 192.168.0.19(rw,sync,root_squash,no_subtree_check) 192.168.0.21(rw,sync,root_squash,no_subtree_check) 192.168.0.23(rw,sync,root_squash,no_subtree_check)
At the PRESENT TIME, the configuration in use is NFS version 4.
In case it could help someone.... |
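To double-check which protocol version a client actually ended up mounting, something like this works (assuming nfs-common is installed): Code:
# show mounted NFS filesystems and their negotiated options (look for vers=3 or vers=4)
nfsstat -m
# or, without nfsstat:
grep ' nfs' /proc/mounts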
|
October 12, 2012, 17:32 |
|
#16 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Hi Giuliano,
And many thanks for sharing the info on how to configure NFS on Ubuntu. Personally, I'm rarely able to get Ubuntu to use NFS properly... but thank goodness openSUSE exists, otherwise I would be missing even more hair on my head... Best regards, Bruno
__________________
|
August 27, 2013, 13:15 |
Deadlock
|
#17 |
Senior Member
Ehsan
Join Date: Mar 2009
Posts: 112
Rep Power: 17 |
Hello
We are running interPhaseChangeFoam in parallel on 24 nodes. The run starts fine and goes on for some time, but afterwards one of our systems stops contributing to the communication and the run hits a deadlock. Could you please help us in this regard? Sincerely, Ehsan |
|
August 27, 2013, 18:00 |
|
#18 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128 |
Greetings Ehsan,
Did you try following the instructions presented in this thread? Best regards, Bruno
__________________
|
|
August 27, 2013, 22:59 |
Parallel run problem
|
#19 |
Senior Member
Ehsan
Join Date: Mar 2009
Posts: 112
Rep Power: 17 |
Hello Bruno
The question is whether the stalling of the run is related to the system setup discussed in this thread, or to something in the interPhaseChangeFoam solver itself when run in parallel. In fact, one of our systems stops contributing after some time, say after 1000 or 2000 time steps; if we stop the run and restart it, it goes ahead until it hits the same problem again. However, we have used this system for parallel runs of other solvers without problems and without these stops. So, is it possible that the problem arises from the interPhaseChangeFoam setup in fvSolution or elsewhere? Regards |
|
August 28, 2013, 07:24 |
Parallel runs
|
#20 |
Senior Member
Ehsan
Join Date: Mar 2009
Posts: 112
Rep Power: 17 |
Hello
We found that the problem is that one system drops off the network, i.e., once we ping it, it does not reply. It is odd: at the start everything runs fine, but after some iterations the machine stops responding on the network. Would you please help me in this regard? Thanks |
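So far, to narrow it down, we could log when the node stops answering with a small watch script like this (a rough sketch; the IP is a placeholder for the failing machine): Code:
# run on the master: note the time whenever the suspect node stops answering
while true; do
    if ! ping -c 1 -W 2 192.168.0.23 > /dev/null 2>&1; then
        echo "$(date): node unreachable" >> ping_watch.log
    fi
    sleep 10
done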
|
|
|
Similar Threads |
Thread | Thread Starter | Forum | Replies | Last Post |
how to set periodic boundary conditions | Ganesh | FLUENT | 15 | November 18, 2020 07:09 |
Issue with OpenMPI-1.5.3 while running parallel jobs on multiple nodes | LargeEddy | OpenFOAM | 1 | March 7, 2012 18:05 |
Issue with running in parallel on multiple nodes | daveatstyacht | OpenFOAM | 7 | August 31, 2010 18:16 |
Error using LaunderGibsonRSTM on SGI ALTIX 4700 | jaswi | OpenFOAM | 2 | April 29, 2008 11:54 |
CFX4.3 -build analysis form | Chie Min | CFX | 5 | July 13, 2001 00:19 |