|
December 4, 2023, 05:35 |
SU2 not running on 8 or more nodes.
|
#1 |
New Member
Pravin
Join Date: Dec 2023
Posts: 2
Rep Power: 0 |
Hello everyone,
I am encountering a runtime issue with SU2. I compiled SU2 successfully with Intel oneAPI 2023 on operating system version 9.2, but when I run it on 8 or 16 nodes it produces no output and becomes unresponsive. I have tried Intel oneAPI 2021, 2022, and 2023, and the problem persists with all of them. Notably, SU2 works correctly when built with OpenMPI and the GNU compiler. With the Intel build, SU2 runs and generates output on a single node and on up to 7 nodes, but it hangs without producing any output on 8 and 16 nodes. Please help me resolve this issue. Thank you, Pravin Last edited by pravin; December 5, 2023 at 02:40. |
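One way to narrow this down is a minimal MPI reproducer that is independent of SU2: if it also hangs on 8 or more nodes when built with the Intel toolchain, the problem lies in the MPI stack or fabric settings rather than in SU2. The sketch below is a hypothetical test program, not part of SU2 or this thread; it assumes a standard MPI installation and would typically be built with the Intel MPI compiler wrapper (e.g. mpiicc) and launched with the same job settings that make SU2 hang.

Code:
/* Hypothetical minimal reproducer (not part of SU2): start MPI, run one
 * collective, and print where each rank lives. If this hangs on 8+ nodes
 * with the Intel build, SU2 is not the culprit. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    /* A single collective touches every inter-node link at least once. */
    MPI_Barrier(MPI_COMM_WORLD);

    printf("rank %d of %d running on %s\n", rank, size, host);
    if (rank == 0)
        printf("MPI_Init and MPI_Barrier completed on all %d ranks\n", size);

    MPI_Finalize();
    return 0;
}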
|
December 4, 2023, 08:30 |
|
#2 |
Senior Member
bigfoot
Join Date: Dec 2011
Location: Netherlands
Posts: 676
Rep Power: 21 |
So the problem only occurs when compiling with this specific compiler? You say that it works when using OpenMPI; is that with the GNU compiler?
|
|
December 4, 2023, 09:22 |
|
#3 |
New Member
Pravin
Join Date: Dec 2023
Posts: 2
Rep Power: 0 |
Yes, using the GNU compiler.
|
|
December 19, 2023, 06:52 |
Request for help
|
#4 |
New Member
Join Date: Dec 2023
Posts: 2
Rep Power: 0 |
Hello all,
I am having the same issue. On our HPC environment, Slurm is used to submit jobs and I have had no issues with other jobs, but an SU2 MPI job hangs when run on more than 7 nodes. Does anyone have similar experience? |
|
December 19, 2023, 08:19 |
|
#5 |
Senior Member
bigfoot
Join Date: Dec 2011
Location: Netherlands
Posts: 676
Rep Power: 21 |
When you say nodes, do you actually mean cores? HPC systems usually have multiple cores per node, so an issue can arise when the cores used by the job are spread over multiple nodes and the communication between the nodes is faulty. You could check this by forcing your scheduler to run on 2 cores/1 node and then on 2 cores/2 nodes.
Since this does not seem to be a common issue (I do not have any problems running on multiple cores or nodes), it is likely tied to your specific compiler and MPI version, so I would try a few different compilers and MPI versions to see whether a particular combination is the cause. |
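As a concrete way to do the check suggested above, the hypothetical program below (not taken from SU2 or this thread) bounces a message between the first and last rank; with the usual block placement of ranks, those two ranks land on different nodes whenever more than one node is requested. If this hangs with the Intel MPI build across nodes but completes with the OpenMPI/GNU build, the inter-node fabric or MPI configuration is the likely cause. It assumes a standard MPI installation; actual rank placement depends on your scheduler settings.

Code:
/* Hypothetical point-to-point check between rank 0 and the last rank. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    double buf = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size < 2) {
        if (rank == 0) printf("Run with at least 2 ranks.\n");
        MPI_Finalize();
        return 0;
    }

    if (rank == 0) {
        /* Ping the last rank and wait for the reply. */
        MPI_Send(&buf, 1, MPI_DOUBLE, size - 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&buf, 1, MPI_DOUBLE, size - 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("ping-pong between rank 0 and rank %d completed\n", size - 1);
    } else if (rank == size - 1) {
        /* Echo the message back to rank 0. */
        MPI_Recv(&buf, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send(&buf, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}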
|
December 19, 2023, 08:25 |
|
#6 |
New Member
Join Date: Dec 2023
Posts: 2
Rep Power: 0 |
Yes, by nodes I mean physical nodes. The scheduler is configured to use the maximum number of cores, and the batch job is defined to use 64 cores from each node.
And yes, it is compiler-related: everything works fine with OpenMPI 4, but we want to use Intel oneAPI, and with it SU2 produces no output. |
|