March 3, 2009, 13:10 |
|
#1 |
Senior Member
BastiL
Join Date: Mar 2009
Posts: 530
Rep Power: 20 |
Dear forum,
I have some jobs running on our Opteron Myrinet cluster. Convergence is fine, but the jobs are damn slow. I see almost no speedup compared to runs on Ethernet workstations, and I am wondering whether OpenMPI is actually using Myrinet. I have built it with MX support. We average about 5 min per iteration (about 10-15 pressure steps per iteration, which is fine), whereas FLUENT needs about 30 seconds per iteration on the same number of CPUs and the same mesh. |
|
March 3, 2009, 16:04 |
|
#2 |
Senior Member
Chris Sideroff
Join Date: Mar 2009
Location: Ottawa, ON, CAN
Posts: 434
Rep Power: 22 |
Are you sure the processes were distributed to the cluster nodes and are not all running on the node you launched mpirun from?
If this is a Linux cluster, log into one of the remote nodes you explicitly told the processes to run on and check the running processes using top or ps. If you are using queuing/scheduling software (PBS, SGE, etc.), find out where it sent the processes and do the same check there. |
|
March 3, 2009, 16:58 |
|
#3 |
Senior Member
BastiL
Join Date: Mar 2009
Posts: 530
Rep Power: 20 |
Hi Chris,
yes, I did that. The processes run where they should. The only thing I am not doing so far is distributing the data to the local nodes; I am running the case from an NFS share. However, this should only affect the time needed to write backup data. The difference to FLUENT is OpenMPI vs. HP-MPI. Regards. |
|
March 4, 2009, 10:16 |
|
#4 |
Senior Member
Eugene de Villiers
Join Date: Mar 2009
Posts: 725
Rep Power: 21 |
OpenMPI will not have Myrinet support by default. You will have to recompile OpenMPI with Myrinet support for it to work properly. Or just use HP-MPI, that works too (although you will have to buy a licence).
|
|
March 4, 2009, 10:41 |
|
#5 |
Senior Member
BastiL
Join Date: Mar 2009
Posts: 530
Rep Power: 20 |
Eugene,
I know this; I built OpenMPI with Myrinet support. If I run on our workstations (no Myrinet) I get an error about missing Myrinet modules. I do not get this error on our cluster, where Myrinet is present. However, performance is poor and I get no feedback (apart from the absence of that error message) on whether Myrinet is actually used. I suspect it is not. |
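One way to get at least partial feedback is to ask the Open MPI installation which components it was built with. The sketch below is only an illustration: it assumes the `ompi_info` tool that ships with Open MPI is on the PATH of the node you check, and that it lists components in its usual "MCA <framework>: <component> (...)" layout. The function names and the sample text are hypothetical, not taken from this thread.

```python
import re
import shutil
import subprocess


def mx_components(ompi_info_text):
    """Return the MCA frameworks (e.g. 'btl', 'mtl') that list an 'mx' component."""
    pattern = re.compile(r"^\s*MCA\s+(\w+):\s+mx\b", re.MULTILINE)
    return sorted(set(pattern.findall(ompi_info_text)))


def installed_mx_components():
    """Run ompi_info if it is on PATH; return None when it is not installed."""
    exe = shutil.which("ompi_info")
    if exe is None:
        return None
    result = subprocess.run([exe], capture_output=True, text=True, check=False)
    return mx_components(result.stdout)


# Hypothetical sample text (assumed layout, version details left out on purpose).
SAMPLE = """\
                 MCA btl: self (version details omitted)
                 MCA btl: mx (version details omitted)
                 MCA btl: tcp (version details omitted)
"""

print(mx_components(SAMPLE))       # frameworks offering an mx component in the sample
print(installed_mx_components())   # None if ompi_info is not available here
```

An empty list would suggest the build has no MX component at all. A non-empty list only shows that MX is available, not that a given run selected it. As far as I know, restricting the transports on the mpirun command line (for example `--mca btl mx,sm,self`) makes a run fail instead of silently falling back to TCP, but check this against the documentation for your Open MPI version.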
|
March 5, 2009, 04:59 |
|
#6 |
Senior Member
BastiL
Join Date: Mar 2009
Posts: 530
Rep Power: 20 |
Here are two iterations from the log:

Time = 71
DILUPBiCG: Solving for Ux, Initial residual = 0.000235829, Final residual = 4.2349e-06, No Iterations 2
DILUPBiCG: Solving for Uy, Initial residual = 0.00219142, Final residual = 6.32653e-05, No Iterations 2
DILUPBiCG: Solving for Uz, Initial residual = 0.00128352, Final residual = 1.65462e-05, No Iterations 2
GAMG: Solving for p, Initial residual = 0.00667156, Final residual = 5.49445e-06, No Iterations 9
time step continuity errors : sum local = 3.54209e-06, global = -1.48371e-07, cumulative = -0.00037855
DILUPBiCG: Solving for epsilon, Initial residual = 0.067249, Final residual = 1.9244e-10, No Iterations 1
bounding epsilon, min: -100901 max: 1.417e+09 average: 32636.2
DILUPBiCG: Solving for k, Initial residual = 2.33946e-06, Final residual = 2.33946e-06, No Iterations 0
ExecutionTime = 19028.4 s ClockTime = 19082 s

Time = 72
DILUPBiCG: Solving for Ux, Initial residual = 0.000234464, Final residual = 4.19113e-06, No Iterations 2
DILUPBiCG: Solving for Uy, Initial residual = 0.00216742, Final residual = 6.50005e-05, No Iterations 2
DILUPBiCG: Solving for Uz, Initial residual = 0.00127756, Final residual = 1.62209e-05, No Iterations 2
GAMG: Solving for p, Initial residual = 0.00666254, Final residual = 5.60993e-06, No Iterations 9
time step continuity errors : sum local = 3.61005e-06, global = -1.35679e-07, cumulative = -0.000378685
DILUPBiCG: Solving for epsilon, Initial residual = 0.0692427, Final residual = 2.30982e-10, No Iterations 1
bounding epsilon, min: -46613.7 max: 1.40629e+09 average: 32421.4
DILUPBiCG: Solving for k, Initial residual = 2.45671e-06, Final residual = 2.45671e-06, No Iterations 0
ExecutionTime = 19214.4 s ClockTime = 19268 s

This is more than 3 minutes (186 s) for one iteration. Is this OK for a case with about 26 million cells running on 32 Opteron 2220 CPUs with a Myrinet interconnect? I feel it is much too slow. Regards, BastiL |
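The per-iteration time can be read off the log by differencing consecutive ExecutionTime values. A minimal sketch, using only the two ExecutionTime lines quoted above; the function name `seconds_per_step` is made up for this illustration:

```python
import re

# The two timing lines from the log excerpt above.
LOG = """\
Time = 71
ExecutionTime = 19028.4 s ClockTime = 19082 s
Time = 72
ExecutionTime = 19214.4 s ClockTime = 19268 s
"""


def seconds_per_step(log_text):
    """Return the differences between consecutive ExecutionTime entries, in seconds."""
    times = [
        float(value)
        for value in re.findall(r"ExecutionTime = ([0-9.eE+-]+) s", log_text)
    ]
    # Round to suppress floating-point noise from the subtraction.
    return [round(later - earlier, 6) for earlier, later in zip(times, times[1:])]


print(seconds_per_step(LOG))
```

For the excerpt this gives roughly 186 s for the step from Time = 71 to Time = 72. On a full log file the same function gives one value per step, which makes it easy to see whether a change to the solver settings actually shortened the iterations.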
|
March 5, 2009, 05:22 |
|
#7 |
Senior Member
BastiL
Join Date: Mar 2009
Posts: 530
Rep Power: 20 |
OK, this problem was caused by poorly chosen solver settings.
Regards |
|
March 5, 2009, 17:17 |
|
#8 |
Member
Johan Spång
Join Date: Mar 2009
Location: Stockholm, Sweden
Posts: 35
Rep Power: 17 |
BastiL, would you mind sharing what you had to change in the solver settings?
Regards |
|
March 5, 2009, 17:38 |
|
#9 |
Senior Member
BastiL
Join Date: Mar 2009
Posts: 530
Rep Power: 20 |
Yes, I had nIterFinestLevel for the preconditioner set to too high a value.
|