Running damBreak with OpenFOAM 1.5 and Open MPI on mixed up CPU Systems |
November 19, 2008, 05:58 |
|
#1 |
New Member
Dennis Lange
Join Date: Mar 2009
Posts: 8
Rep Power: 17 |
Hi everybody,
I am trying to run damBreakFine on four single-CPU systems plus one dual-CPU system via Open MPI. The machines file looks like this:

node1
node2
node3
node4
node5

I start the processes with:

mpirun --hostfile damBreakFine/system/machines -np 5 interFoam -case damBreakFine -parallel >& damBreakFine/log &

If I mpirun on the single-CPU systems only, everything works fine. If I mpirun on the dual-CPU system only, it also works fine. But if I mix them, mpirun spawns a process for every CPU and drives them to 100% usage, but the run does not get any further. The log file shows only:

create mesh for time = 0

I have to kill one interFoam process to bring the CPUs back to a normal usage rate. Any suggestions? |
|
November 19, 2008, 09:26 |
|
#2 |
New Member
Dennis Lange
Join Date: Mar 2009
Posts: 8
Rep Power: 17 |
Hello again,
I also compiled a small MPI "Hello World" test, and it works fine on 6 CPUs:
----------------------
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int node;        /* MPI rank of this process */
    char buf[1024];  /* host name of the machine we run on */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);

    gethostname(buf, sizeof(buf));
    /* Pass the host name as an argument, never as the format string. */
    printf("%s says hello world with nid %d \n", buf, node);

    MPI_Finalize();
    return 0;
}
--------------------
> mpicc hello.c -o hello

This time the machines file looks like this:

node1
node2
node3
node4
node5 cpu=2

> mpirun --hostfile machines -np 10 ./hello
node1 says hello world with nid 0
node2 says hello world with nid 1
node3 says hello world with nid 2
node4 says hello world with nid 3
node5 says hello world with nid 4
node2 says hello world with nid 7
node3 says hello world with nid 8
node4 says hello world with nid 9
node5 says hello world with nid 5
node1 says hello world with nid 6

Could it be that parallel runs in OpenFOAM 1.5 are only supported on up to 4 processors? |
|
November 19, 2008, 10:11 |
|
#3 |
Member
Dennis Kingsley
Join Date: Mar 2009
Location: USA
Posts: 45
Rep Power: 17 |
I am running turbFoam right now on 23 nodes with dual quad-core CPUs; that's 184 cores.
Did you run decomposePar for the new number of processes? |
|
November 19, 2008, 10:34 |
|
#4 |
New Member
Dennis Lange
Join Date: Mar 2009
Posts: 8
Rep Power: 17 |
Ahh good to hear!
Yes, I went through these steps. Editing decomposeParDict:
-------------
numberOfSubdomains 6;

simpleCoeffs
{
    n (1 6 1);
}

metisCoeffs
{
    processorWeights ( 1 1 1 1 1 1 );
}
-------------
> blockMesh
> setFields
> decomposePar -force
> mpirun --hostfile system/machines -np 6 interFoam -parallel >& log & |
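A quick sanity check before launching mpirun is to confirm that numberOfSubdomains, the processorN directories written by decomposePar, and the -np value all agree. The following is only a minimal sketch: the function names are hypothetical helpers, not OpenFOAM tools, and the parsing assumes the simple one-line `numberOfSubdomains 6;` form shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of OpenFOAM): compare numberOfSubdomains
# with the processor* directories and the intended -np value.

# Print the numberOfSubdomains value from a case's decomposeParDict.
subdomains_in_dict() {
    awk '$1 == "numberOfSubdomains" { sub(/;.*/, "", $2); print $2; exit }' \
        "$1/system/decomposeParDict"
}

# Count the processorN directories that decomposePar wrote into the case.
processor_dirs() {
    find "$1" -maxdepth 1 -type d -name 'processor[0-9]*' | wc -l | tr -d ' '
}

# Return 0 only if dictionary, directories and -np all agree.
check_decomposition() {
    local case_dir=$1 np=$2
    local want have
    want=$(subdomains_in_dict "$case_dir")
    have=$(processor_dirs "$case_dir")
    if [ "$want" = "$np" ] && [ "$have" = "$np" ]; then
        echo "ok: $np subdomains"
    else
        echo "mismatch: dict=$want dirs=$have np=$np"
        return 1
    fi
}
```

Run from inside the case directory, `check_decomposition . 6` should print `ok: 6 subdomains` for the setup described in this post. A mismatch only points at a stale decomposition; it does not explain a hang on its own.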
|
November 19, 2008, 11:57 |
|
#5 |
Member
Dennis Kingsley
Join Date: Mar 2009
Location: USA
Posts: 45
Rep Power: 17 |
Did decomposePar give any messages about zero-sized blocks or faces?
Most of the time when I see OF1.5 hang on startup, it is due to decomposition errors or NIC failures. What is the topology/layout of the network for your nodes? Are you using the MPI libraries from OF1.5? |
|
November 20, 2008, 07:00 |
|
#6 |
New Member
Dennis Lange
Join Date: Mar 2009
Posts: 8
Rep Power: 17 |
Thanks for your attention.
decomposePar did not give any messages about zero-sized blocks or faces. Every network interface is working well. The dual system has two network interfaces in the same subnet. The machines are connected to a stellar GigaBit switch, and the Open MPI libs were compiled from the OpenFOAM 1.5 gtgz's on Heron LTS. I also tried the turbFoam solver, but the results are the same. If I use only the single-processor machines, everything runs smoothly, but if I mix them with the dual system, the log file says "create mesh for time = 0" and the processes run at 100% CPU usage until I stop them. |
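Since the dual system has two interfaces in the same subnet, it may help to detect that situation on each node before a run. The sketch below assumes the one-line-per-address output format of `ip -o -4 addr show` (index, interface, `inet`, address/prefix); the function names are hypothetical and the addresses in the usage note are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for spotting two interfaces in the same IPv4 network.

# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print the network number of an address in CIDR form, e.g. 192.168.1.5/24.
network_of() {
    local addr=${1%/*} prefix=${1#*/}
    local mask=$(( prefix == 0 ? 0 : (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    echo $(( $(ip_to_int "$addr") & mask ))
}

# Read `ip -o -4 addr show` output on stdin and report interfaces whose
# addresses fall into the same network. The loopback interface is skipped.
shared_subnets() {
    local idx iface fam cidr rest
    while read -r idx iface fam cidr rest; do
        [ "$iface" = "lo" ] && continue
        echo "$(network_of "$cidr")/${cidr#*/} $iface"
    done | awk 'seen[$1] { print seen[$1] " and " $2 " share a subnet"; next }
                { seen[$1] = $2 }'
}
```

On a node, `ip -o -4 addr show | shared_subnets` prints nothing when every interface sits in its own network. With hypothetical addresses 192.168.1.15/24 on eth0 and 192.168.1.16/24 on eth1, it would report that eth0 and eth1 share a subnet.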
|
November 20, 2008, 09:06 |
|
#7 |
Member
Dennis Kingsley
Join Date: Mar 2009
Location: USA
Posts: 45
Rep Power: 17 |
Is your dual machine actually two CPUs or two cores? If it is two cores, then cpu=2 may not work.
My machine file is auto-generated by the queue manager Torque. However, I think you should have something like this in yours:

node1 //single processor
node2 //single processor
node3 //single processor
node4 //single processor
node5 //dual processor
node5 //dual processor

This matches up with -np 6. Don't add the comments to the machine file. |
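To check that a machines file really offers as many slots as the -np value, the entries can be summed with a few lines of awk. This is a rough sketch: `count_slots` is a hypothetical helper, it counts a bare host name as one slot and a repeated host once per line, and it treats the `cpu=N` spelling used earlier in this thread the same as `slots=N`, which is an assumption of the sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum the process slots a hostfile offers.
count_slots() {
    awk '
        { sub(/#.*/, "") }          # drop # comments
        NF == 0 { next }            # skip blank lines
        {
            n = 1                   # a bare host name counts as one slot
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(slots|cpu)=[0-9]+$/) { split($i, kv, "="); n = kv[2] }
            total += n
        }
        END { print total + 0 }
    ' "$1"
}
```

For the six-line file suggested above (without the comments), `count_slots system/machines` prints 6, which is the value to pass to -np.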
|
November 21, 2008, 16:38 |
|
#8 |
New Member
Dennis Lange
Join Date: Mar 2009
Posts: 8
Rep Power: 17 |
After a phone call with Bernhard Gschaider, I put the dual system aside and added a quad-core machine with one network interface to the single-CPU systems. This time every node is doing well, and I could take some measurements.
Thanks for your help and have a nice weekend! |
|
November 26, 2008, 05:57 |
|
#9 |
New Member
Dennis Lange
Join Date: Mar 2009
Posts: 8
Rep Power: 17 |
Hi,
I solved the problem with the dual-CPU system. After stopping the second network interface with ifdown eth1, everything works fine. |
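An alternative to taking eth1 down might be to tell Open MPI's TCP transport to ignore it through the `btl_tcp_if_exclude` MCA parameter. Nobody in this thread tried that, so treat the following as an untested sketch; `mpirun_excluding` is a hypothetical wrapper that only prints the command so it can be inspected first. The loopback interface is listed explicitly because setting the parameter replaces Open MPI's default exclude list.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print an mpirun command that asks Open MPI's TCP
# transport to skip the given interface instead of taking it down.
# Arguments: <interface to exclude> <hostfile> <np> <solver and its options>
mpirun_excluding() {
    local exclude=$1 hostfile=$2 np=$3
    shift 3
    # "lo" is added because an explicit list replaces the default one.
    echo mpirun --mca btl_tcp_if_exclude "lo,$exclude" \
        --hostfile "$hostfile" -np "$np" "$@"
}
```

Calling `mpirun_excluding eth1 system/machines 6 interFoam -parallel` prints `mpirun --mca btl_tcp_if_exclude lo,eth1 --hostfile system/machines -np 6 interFoam -parallel`, which can then be run by hand if it looks right.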
|