
Parallelised OpenFOAM solver diverges earlier when the number of cores is higher

August 24, 2018, 10:43   #1
Member

Foad (foadsf)
Join Date: Aug 2017
Posts: 58
Hi foamers,

I have a model which I'm trying to solve on a cluster (zip file attached). When I decompose the problem into 7 subdomains it runs until t = 6.74, but with 50 subdomains it stops at t = 0.72! The exact stopping time is fairly random, but in general, the higher the number of cores, the earlier on average the run crashes.

I run the problem with:
Code:
blockMesh
decomposePar
srun -N 1 --ntasks-per-node=7 --pty bash
mpirun -np 7 sonicFoam -parallel -fileHandler uncollated > log.log 2>&1
reconstructPar
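For completeness, the number of subdomains is just numberOfSubdomains in system/decomposeParDict. A minimal sketch of the relevant entries (the scotch method here is an assumption, not necessarily what my attached case uses):

Code:
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

numberOfSubdomains  7;       // 7 ran until t = 6.74; 50 stopped near t = 0.72
method              scotch;  // graph partitioner, needs no extra coefficients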
and this is a sample of the errors I get:

Code:
[21] --> FOAM FATAL ERROR: 
[21] Maximum number of iterations exceeded: 100
[21] 
[21]     From function Foam::scalar Foam::species::thermo<Thermo, Type>::T(Foam::scalar, Foam::scalar, Foam::scalar, Foam::scalar (Foam::species::thermo<Thermo, Type>::*)(Foam::scalar, Foam::scalar) const, Foam::scalar (Foam::species::thermo<Thermo, Type>::*)(Foam::scalar, Foam::scalar) const, Foam::scalar (Foam::species::thermo<Thermo, Type>::*)(Foam::scalar) const) const [with Thermo = Foam::hConstThermo<Foam::perfectGas<Foam::specie> >; Type = Foam::sensibleInternalEnergy; Foam::scalar = double; Foam::species::thermo<Thermo, Type> = Foam::species::thermo<Foam::hConstThermo<Foam::perfectGas<Foam::specie> >, Foam::sensibleInternalEnergy>]
[21]     in file /home/foobar/OpenFOAM/OpenFOAM-v1712/src/thermophysicalModels/specie/lnInclude/thermoI.H at line 73.
[21] 
FOAM parallel run aborting
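From what I can tell from thermoI.H, the function named in the error is a Newton iteration that recovers T from the energy field. Here is a minimal standalone sketch of the mechanism (my own simplification with an assumed tolerance and a toy caloric law, not the OpenFOAM source); the point is that the 100-iteration guard only trips when the routine is fed an unphysical energy, i.e. after the solution has already gone bad on that processor:

Code:
// sketch of the temperature-recovery loop guarded by maxIter = 100
#include <cmath>
#include <cstdio>
#include <stdexcept>

double TfromEnergy(double e, double T0,
                   double (*E)(double), double (*dEdT)(double))
{
    const int    maxIter = 100;    // same limit as in the error above
    const double tol     = 1.0e-4; // relative tolerance on T (assumed)
    double Test = T0;
    double Tnew = T0;
    int iter = 0;

    do
    {
        Test = Tnew;
        Tnew = Test - (E(Test) - e)/dEdT(Test);  // Newton step
        if (iter++ > maxIter)
        {
            // analogue of "Maximum number of iterations exceeded: 100"
            throw std::runtime_error("Maximum number of iterations exceeded");
        }
    } while (std::fabs(Tnew - Test) > tol*T0);

    return Tnew;
}

// toy caloric law e(T) = cv*T, cv = 718 J/(kg K) (assumed, roughly air)
static double E(double T) { return 718.0*T; }
static double dEdT(double) { return 718.0; }

int main()
{
    std::printf("T = %g K\n", TfromEnergy(718.0*300.0, 350.0, E, dEdT));
}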
The fatal error is followed by the backtraces of all ~50 ranks, interleaved in the log. Untangled, each rank reports essentially the same frames:

Code:
[49] #4  Foam::hePsiThermo<Foam::psiThermo, Foam::pureMixture<Foam::constTransport<Foam::species::thermo<Foam::hConstThermo<Foam::perfectGas<Foam::specie> >, Foam::sensibleInternalEnergy> > > >::correct() at ??:?
[21] #5  ? at ??:?
[21] #6  __libc_start_main in "/lib/x86_64-linux-gnu/libc.so.6"
[21] #7  ? at ??:?
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[weleveld:20500] 3 more processes have sent help message help-mpi-api.txt / mpi-abort
[weleveld:20500] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
The complete error message can be found in the attachments (split into three files: download all three, remove the trailing .txt, and unzip; a shell sketch for this follows the attachment list).

I would appreciate it if you could let me know what the problem is and how I can solve it.
Attached Files
File Type: zip 20180823.zip (7.3 KB, 2 views)
File Type: txt log.zip.001.txt (190.0 KB, 2 views)
File Type: txt log.zip.002.txt (190.0 KB, 2 views)
File Type: txt log.zip.003.txt (152.8 KB, 2 views)
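For anyone reassembling the full log, a shell sketch (assuming the parts are ordinary split-archive volumes and 7-Zip is available):

Code:
# strip the .txt suffix the forum forces onto attachments
for f in log.zip.00?.txt; do mv "$f" "${f%.txt}"; done
# 7z finds the .002 and .003 volumes automatically from the first part
7z x log.zip.001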

August 29, 2018, 14:12   #2
Senior Member

Michael Alletto (mAlletto)
Join Date: Jun 2018
Location: Bremen
Posts: 616
I think this is not unusual: with an increasing number of processors the solution becomes more and more decoupled, so the convergence behaviour changes with the decomposition.
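As I understand it, one concrete source of the decoupling is that OpenFOAM's standard DIC/DILU preconditioners and smoothers are factorised per processor block; the inter-processor coupling enters only through boundary updates during the matrix-vector products, so the effective linear solver, and with it the convergence path, changes with the number of subdomains. A minimal fvSolution fragment just to point at the relevant entries (the PCG/DIC choice is an assumption, not read from the attached case):

Code:
solvers
{
    p
    {
        solver          PCG;
        preconditioner  DIC;    // factorised per processor block: its
                                // quality drops as the block count grows
        tolerance       1e-08;
        relTol          0;      // tightening tolerances can partly offset
                                // the decomposition dependence
    }
}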

August 29, 2018, 19:17   #3
Member

Foad (foadsf)
Join Date: Aug 2017
Posts: 58
Quote:
Originally Posted by mAlletto View Post
I think this is not unusual: with an increasing number of processors the solution becomes more and more decoupled, so the convergence behaviour changes with the decomposition.
So what is the best solution?

September 5, 2018, 03:53   #4
Senior Member

TWB (quarkz)
Join Date: Mar 2009
Posts: 414
Hi, I have a similar problem, but the opposite of yours: running on 288 cores it diverges, but on 600 cores I get a converged solution:

Problem diverges or not depending on procs number


Tags
diverge, mpi, openfoam, slurm, sonicfoam


