
DecomposeParDict inaccurate

April 5, 2019, 13:15   #1
Rasmus Iwersen (Member, Denmark)

Hi all,

I am experiencing problems when comparing serial and parallel (decomposed) runs of my simulation.

Two identical cases have been run, one locally (in the terminal) and one through a job script on a server at my university. The two identical simulations do not give the same force output, even though nothing was changed between them besides the way the execution was initiated. Specifically, the parallel run gives a force output that is a bit smaller than the serial run.

Does anyone have an idea why? I can't get my head around it.

Best
Rasmus


April 7, 2019, 08:22   #2
Michael Alletto (Senior Member, Bremen)

It is quite common to see small differences between two different architectures due to different treatment of round-off errors.
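Floating-point arithmetic is not associative, so simply summing the same numbers in a different order, which is what happens when contributions are accumulated per subdomain and then combined, can already change the last digits. A minimal sketch of this effect (using awk purely for illustration):
Code:
awk 'BEGIN {
    a = 1.0e16; b = -1.0e16; c = 1.0;
    # the same three numbers, summed in two different orders
    printf "(a+b)+c = %.1f   a+(b+c) = %.1f\n", (a + b) + c, a + (b + c)
}'
This prints 1.0 for the first order and 0.0 for the second.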

April 8, 2019, 02:57   #3
Rasmus Iwersen (Member, Denmark)

Quote (originally posted by mAlletto):
It is quite common to see small differences between two different architectures due to different treatment of round-off errors.

mAlletto, is that round-off related to the boundaries of the subdomains created for the parallel simulation? For example, I have divided my domain into 6 subdomains. My guess is that the pressure resolution is what is wrong (since I have checked with different turbulence models with no difference), but I cannot get the difference between the serial and the parallel simulation small enough.
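For reference, the kind of system/decomposeParDict I mean is sketched below; the simple method and the 3 x 2 x 1 split are only an example of a 6-subdomain decomposition, not necessarily the exact settings in use:
Code:
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

numberOfSubdomains  6;

method              simple;

simpleCoeffs
{
    n       (3 2 1);   // 3 x 2 x 1 = 6 subdomains
    delta   0.001;
}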

April 8, 2019, 04:35   #4
Michael Alletto (Senior Member, Bremen)

Can you describe your case a bit more in detail? From the little information you provide it is quite difficult to judge what is wrong.

In general, you may expect differences between decomposed and single-processor cases, since for a decomposed case the matrix you solve is less strongly coupled.

Best Michael

April 12, 2019, 03:48   #5
Rasmus Iwersen (Member, Denmark)

Quote (originally posted by mAlletto):
Can you describe your case a bit more in detail? From the little information you provide it is quite difficult to judge what is wrong.

In general, you may expect differences between decomposed and single-processor cases, since for a decomposed case the matrix you solve is less strongly coupled.

Best Michael
Most certainly.

My modelling domain is a rectangular box in which I have inserted a 0.5 m x 0.5 m cube in the middle. The domain is subjected to an oscillating flow. The force output I extract is on the cube walls perpendicular to the flow direction, each of height 0.5 m and width 0.5 m. So that is the physical set-up.

I have then copied the entire case to a new folder where every setting is maintained, i.e. I have changed nothing. For the first folder, I do the following routine:

Delete the following files in constant/polyMesh:
cellLevel
cellZones
faceZones
level0Edge
pointLevel
pointZones
refinementHistory
surfaceIndex

cd ../..
blockMesh
surfaceFeatureExtract
snappyHexMesh -overwrite

From this point on, I run the case with the solver oscillatingPimpleFoam directly from the terminal.
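Collected into one small script, the serial routine is essentially the following (a sketch; serial.log is just an example log name):
Code:
#!/bin/sh
# Serial workflow: remove the snappyHexMesh leftovers, remesh, run the solver

rm -f constant/polyMesh/cellLevel constant/polyMesh/cellZones \
      constant/polyMesh/faceZones constant/polyMesh/level0Edge \
      constant/polyMesh/pointLevel constant/polyMesh/pointZones \
      constant/polyMesh/refinementHistory constant/polyMesh/surfaceIndex

blockMesh
surfaceFeatureExtract
snappyHexMesh -overwrite

oscillatingPimpleFoam > serial.log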

In the copied case, the following job script is used:

#!/bin/sh
#PBS -S /bin/sh
#PBS -N CASENAME
#PBS -q hpc
#PBS -l nodes=1:ppn=6
#PBS -l walltime=3:00:00
#PBS -M your_email_address
#PBS -m abe
NPROCS=`wc -l < $PBS_NODEFILE`
module load OpenFoam/2.2.2/gcc-4.7.2-openmpi
cd DIR

# Initialize dir

rm -rf 0/
cp -rf 0.org/ 0/
rm -rf constant/polyMesh
cp -rf constant/polyMesh.org constant/polyMesh

# Setup mesh in 3D
blockMesh
surfaceFeatureExtract
snappyHexMesh -overwrite
checkMesh > mesh.log
decomposePar -force
mpirun -np 6 oscillatingPimpleFoam -parallel > sim.log
reconstructPar
#sample
#rm -rf processor*

where CASENAME and DIR are the name of the specific case and the directory, respectively.

The force time series from the serial run is then compared to the same output file from the parallel run. The force series typically looks sinusoidal.

At the peak of a "sine wave" the serial result is slightly larger than the parallel one, which is what I don't understand.

I don't know if you have a different answer given the above description; I just wanted to check whether this is a normal phenomenon or whether I am doing something wrong.

Thank you for your time!

April 13, 2019, 07:36   #6
Michael Alletto (Senior Member, Bremen)

How big is the difference in the peak force?



But I would suggest that you run both the single-domain case and the decomposed case on the compute nodes. That way you reduce the number of parameters you change by one; I'm not sure whether the architecture of the login nodes is always the same as that of the compute nodes. The only difference in the scripts you submit should be
Code:
 decomposePar -force
mpirun -np 6 oscillatingPimpleFoam -parallel > sim.log
for the parallel run

and
Code:
 #decomposePar -force
  oscillatingPimpleFoam  > sim.log
for the single processor run.
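So the serial submission would be essentially the same PBS script you posted, with only the last block changed, roughly like this (a sketch based on your script; adjust the job name and resource request as needed):
Code:
#!/bin/sh
#PBS -S /bin/sh
#PBS -N CASENAME_serial
#PBS -q hpc
#PBS -l nodes=1:ppn=1
#PBS -l walltime=3:00:00
module load OpenFoam/2.2.2/gcc-4.7.2-openmpi
cd DIR

rm -rf 0/
cp -rf 0.org/ 0/
rm -rf constant/polyMesh
cp -rf constant/polyMesh.org constant/polyMesh

blockMesh
surfaceFeatureExtract
snappyHexMesh -overwrite
checkMesh > mesh.log

# no decomposePar and no mpirun: single-processor run on the same compute node
oscillatingPimpleFoam > sim.log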



By doing so you'll isolate the differences coming from running the case in parallel. But you do have to expect differences between a single-processor run and a parallel run.



For a single run you solve one matrix for the whole domain. For the parallel run (let's say you run it on 6 processors) you solve 6 matrices which are only coupled at the processor boundaries via boundary conditions. You can observe this loose coupling between the processors in the convergence: the single-processor runs usually converge faster than the parallel runs.


Best


Michael
