Performance Issue - flmpi stops CX process busy
October 10, 2023, 15:49 |
#1 |
New Member
Robert Schmitt
Join Date: Apr 2023
Posts: 10
Rep Power: 3 |
Good evening all,
I am hoping someone can point me in the right direction in figuring out why a certain model is performing slowly. While solving (on a local single machine or over msmpi / the MS job scheduler), the fl_mpi23* processes work for a few seconds but then stop until the next iteration. During this "wait" period, the cx23* process is extremely busy doing I don't know what. Is there any way I can find out what it is actually trying to do? Things we have tried and observed:
All other benchmarks we have tried perform at 100%. Our system is 12 nodes of dual 44-core Intel Xeon 6152 CPUs with 6 memory channels per CPU. Storage is SSD. The network is 100 Gb IB. The "pausing" shows up in the performance timer: an iteration supposedly takes 4 seconds, yet 30 iterations take 1276 seconds. The Fluent output shows no errors suggesting that anything is wrong. Disk queues are empty. The "average wall-clock time per iteration" appears to scale with the number of nodes/cores given to the job, but the waiting remains. If anyone can suggest how we can find the root cause, I'd really appreciate it. Code:
Performance timer output:
Performance Timer for 30 iterations on 240 compute nodes
  Average wall-clock time per iteration: 4.032 sec
  Global reductions per iteration: 443 ops
  Global reductions time per iteration: 0.000 sec (0.0%)
  Message count per iteration: 760147 messages
  Data transfer per iteration: 3612.734 MB
  LE solves per iteration: 5 solves
  LE wall-clock time per iteration: 0.607 sec (15.1%)
  LE global solves per iteration: 2 solves
  LE global wall-clock time per iteration: 0.025 sec (0.6%)
  LE global matrix maximum size: 355
  AMG cycles per iteration: 6.000 cycles
  Relaxation sweeps per iteration: 436 sweeps
  Relaxation exchanges per iteration: 0 exchanges
  LE early protections (stall) per iteration: 0.000 times
  LE early protections (divergence) per iteration: 0.000 times
  Total SVARS touched: 398
  DPM updates per iteration: 0.5000 updates
  DPM wall-clock time per iteration: 1.334 sec (33.1%)
  Time-step updates per iteration: 0.50 updates
  Time-step wall-clock time per iteration: 2.448 sec (60.7%)
  Total wall-clock time: 120.947 sec
  Total dpm solve time: 40.029 sec
  Total dpm i/o time: 0.000 sec
Simulation wall-clock time for 30 iterations 1276 sec
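To put a number on the "pausing", the two totals in the timer output can be compared directly. The sketch below is only an illustration: the function name `timer_gap` and the regular expressions are hypothetical helpers, not part of Fluent, and it assumes that "Total wall-clock time" and "Simulation wall-clock time" cover the same 30 iterations, so that their difference is time spent outside the solver timers. With the values above, that difference is about 1155 s, or roughly 38.5 s per iteration.

```python
import re

# Lines copied from the performance timer output in this post.
TIMER_OUTPUT = """\
Performance Timer for 30 iterations on 240 compute nodes
Average wall-clock time per iteration: 4.032 sec
Total wall-clock time: 120.947 sec
Simulation wall-clock time for 30 iterations 1276 sec
"""


def timer_gap(text):
    """Return the wall-clock time not covered by the solver timer.

    Assumption (not confirmed by the source): both totals describe the
    same set of iterations, so the difference is time spent elsewhere,
    for example while the fl_mpi processes sit idle.
    """
    iterations = int(
        re.search(r"Performance Timer for (\d+) iterations", text).group(1)
    )
    solver_total = float(
        re.search(r"Total wall-clock time:\s*([\d.]+) sec", text).group(1)
    )
    simulation_total = float(
        re.search(
            r"Simulation wall-clock time for \d+ iterations\s+([\d.]+) sec", text
        ).group(1)
    )
    unaccounted = simulation_total - solver_total
    return {
        "iterations": iterations,
        "solver_total_sec": solver_total,
        "simulation_total_sec": simulation_total,
        "unaccounted_sec": unaccounted,
        "unaccounted_per_iteration_sec": unaccounted / iterations,
        "unaccounted_fraction": unaccounted / simulation_total,
    }


if __name__ == "__main__":
    for key, value in timer_gap(TIMER_OUTPUT).items():
        print(f"{key}: {value}")
```

If that assumption holds, roughly nine tenths of the run time falls outside what the solver timer reports, which is consistent with the fl_mpi23* processes idling between iterations.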