#1
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 8
We have a serial 3D unstructured-grid code which we are trying to parallelize using MPI. For cases involving shocks we have convergence problems when using a 2nd-order upwind scheme: the code blows up.
We have tried communicating the higher-order reconstructed values at the interface faces, but the parallel code shows oscillations along the shock that are not present in the serial code. Can anyone suggest a possible solution to this issue? Thanks.
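For reference, the kind of exchange we mean is sketched below. This is only an illustration, not our actual code; the buffer layout, counts and routine names are placeholders.

```c
/* Illustrative halo exchange of reconstructed interface-face states.
 * Buffer layout, counts and names are placeholders, not the actual code. */
#include <mpi.h>
#include <stdlib.h>

/* send_buf[p]/recv_buf[p]: packed face states shared with neighbour rank p */
void exchange_face_states(int n_neighbors, const int *nbr_rank,
                          const int *n_send, double **send_buf,
                          const int *n_recv, double **recv_buf,
                          int n_vars, MPI_Comm comm)
{
    MPI_Request *req = malloc(2 * n_neighbors * sizeof(MPI_Request));

    /* post all receives first, then the matching sends */
    for (int p = 0; p < n_neighbors; ++p)
        MPI_Irecv(recv_buf[p], n_recv[p] * n_vars, MPI_DOUBLE,
                  nbr_rank[p], 0, comm, &req[p]);
    for (int p = 0; p < n_neighbors; ++p)
        MPI_Isend(send_buf[p], n_send[p] * n_vars, MPI_DOUBLE,
                  nbr_rank[p], 0, comm, &req[n_neighbors + p]);

    MPI_Waitall(2 * n_neighbors, req, MPI_STATUSES_IGNORE);
    free(req);
}
```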
#2
Senior Member
Selig
Join Date: Jul 2016
Posts: 213
Rep Power: 11
When you say "blow up", are you referring to numerical blow-up or to your simulation crashing due to memory leaks? If it is numerical blow-up, I would run the serial code under the exact same conditions to make sure it is not an implementation issue. For debugging MPI codes I would use TotalView; it is pretty much the standard.
#3
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 8
The blow-up is a numerical blow-up. The parallel code converges for a few time-step iterations and then diverges. We have set the same conditions in both cases. The first-order accurate simulations converge well; the issue is with the implementation of the 2nd-order accurate convective scheme.
#4
Senior Member
Join Date: Oct 2011
Posts: 242
Rep Power: 17
Did you first check, at first order, that you get exactly the same results (up to machine accuracy) between the serial and parallel codes? You did not mention it.
If so, then on to second order. You mention that you have a communication point for the limited reconstructed states, I guess prior to the Riemann solver. How do you compute the reconstructed states? Assuming that you first compute the gradients with Green-Gauss or least-squares, both methods need average values from a neighbourhood. Did you communicate these average values prior to the gradient computation? Depending on your implementation, the same remark may hold for the limiter; for example, the BJ (Barth-Jespersen) limiter needs the min and max over a neighbourhood.
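In other words, the usual ordering within one second-order residual evaluation is roughly the following. This is a compile-only sketch; all routine and field names are placeholders, not a specific code's API.

```c
/* Ordering of halo exchanges within one second-order residual evaluation.
 * All names are placeholders; only the ordering of the calls matters. */
typedef struct Field Field;            /* opaque handle for a cell-based field */

void exchange_ghost_cells(Field *f);   /* MPI halo exchange for one field      */
void compute_gradients(void);          /* Green-Gauss or least-squares         */
void compute_limiter_min_max(void);    /* min/max over the full stencil        */
void reconstruct_face_states(void);    /* limited extrapolation to faces       */
void riemann_fluxes(void);             /* upwind fluxes, incl. partition faces */

void second_order_residual(Field *avg, Field *grad, Field *lim)
{
    exchange_ghost_cells(avg);   /* 1. cell averages into ghost cells          */
    compute_gradients();         /* 2. gradients need the neighbour averages   */
    exchange_ghost_cells(grad);  /* 3. ghost-cell gradients                    */
    compute_limiter_min_max();   /* 4. limiter min/max, incl. off-rank cells   */
    exchange_ghost_cells(lim);   /* 5. ghost-cell limiter values               */
    reconstruct_face_states();   /* 6. limited reconstruction at the faces     */
    riemann_fluxes();            /* 7. Riemann solver on interior + interface  */
}
```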
#5
Senior Member
Join Date: Dec 2017
Posts: 153
Rep Power: 8
Hi,
As people here are saying, you have a bug in the parallel code. Trust me, I have done the same job with my code, and the results of the two codes must be the same. Since for second order you need the upwind and the downwind cells, have you checked that the ghost nodes are exchanged correctly? Naturally this is just a guess!
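A quick way to check this is to exchange a field whose correct ghost values are known in advance, for example the global cell IDs. The sketch below assumes an exchange routine and connectivity arrays whose names are purely illustrative.

```c
/* Sanity check for the ghost-cell exchange: fill a test field of the owned
 * cells with their global IDs, run the usual exchange, and verify that every
 * ghost cell received the ID of the cell it mirrors. Names are placeholders. */
#include <math.h>
#include <stdio.h>

extern int     n_owned, n_ghost;      /* local cell counts                     */
extern long   *global_id;             /* global ID of each owned cell          */
extern long   *ghost_global_id;       /* global ID each ghost cell should hold */
extern double *test_field;            /* size n_owned + n_ghost                */
void exchange_ghost_cells_double(double *f);

int check_ghost_exchange(void)
{
    for (int c = 0; c < n_owned; ++c)
        test_field[c] = (double)global_id[c];

    exchange_ghost_cells_double(test_field);

    int n_bad = 0;
    for (int g = 0; g < n_ghost; ++g) {
        if (fabs(test_field[n_owned + g] - (double)ghost_global_id[g]) > 0.5) {
            printf("ghost %d wrong: got %g, expected %ld\n",
                   g, test_field[n_owned + g], ghost_global_id[g]);
            ++n_bad;
        }
    }
    return n_bad;   /* 0 means every ghost maps to the right owner cell */
}
```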
#6
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 8
Thanks for the replies.
I would like to clarify that we get a maximum error of about 3.47 x 10^-3 percent in the first-order calculations when using 16 processors. I have attached images of the results for both the 1st-order and 2nd-order cases. Is this fine? The code uses implicit Jacobians solved with the LU-SGS method, and we have not modified the Jacobian part for parallel computations so far.
Regarding the communication of reconstructed values: once we obtain the solution at a new time step we communicate the solution at the interface cells, and use these values to form the averages of the variables. These are then used to calculate the Green-Gauss gradient at all cell centres. We then calculate the limited reconstructed values at the faces, which are communicated prior to solving for the solution at the next time step. We use the Venkatakrishnan limiter, and we calculate the minimum and maximum across the neighbourhood including the neighbour cells in other zones as well, which does not change the result.
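For completeness, the limiter function we refer to is the standard Venkatakrishnan form from the literature; a stand-alone sketch of the scalar function for one cell/face pair is shown below (the loop over faces and the neighbourhood min/max are not shown, and the names are illustrative).

```c
/* Standard Venkatakrishnan limiter function for one cell/face pair.
 * d1 = u_max - u_i when d2 > 0, or u_min - u_i when d2 < 0
 * d2 = grad(u) . (x_face - x_cell), the unlimited extrapolation increment
 * h  = characteristic cell size, K = Venkatakrishnan constant            */
#include <math.h>

double venkat_phi(double d1, double d2, double h, double K)
{
    if (d2 == 0.0)
        return 1.0;                   /* no extrapolation, nothing to limit  */

    double eps2 = pow(K * h, 3.0);    /* smoothing threshold eps^2 = (K h)^3 */
    double num  = (d1 * d1 + eps2) * d2 + 2.0 * d2 * d2 * d1;
    double den  = d1 * d1 + 2.0 * d2 * d2 + d1 * d2 + eps2;
    return (num / den) / d2;
}
```

The limiter of a cell is then the minimum of this value over all its faces, with u_max and u_min taken over a neighbourhood that includes ghost cells owned by other ranks, as described above.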
#7
Senior Member
Join Date: Dec 2017
Posts: 153
Rep Power: 8
Hi,
Unfortunately, a bug in MPI code is very tedious to find. In my case, for example, I plotted a contour of the central coefficient of the matrices for the serial and the parallel versions, and did the same thing for the RHS. Then I looked at these results together. With this method you can see immediately whether your equations are the same, and I mean exactly the same to the 16th decimal digit.
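One simple way to set up such a comparison is to dump, for every owned cell, its global ID together with the diagonal coefficient and the RHS to a per-rank file, then compare against the serial run. The sketch below uses placeholder array names.

```c
/* Dump global cell ID, central (diagonal) coefficient and RHS per owned cell
 * to a per-rank text file for serial/parallel comparison. Names are placeholders. */
#include <stdio.h>
#include <mpi.h>

void dump_diag_and_rhs(int n_owned, const long *global_id,
                       const double *diag, const double *rhs)
{
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char name[64];
    snprintf(name, sizeof(name), "diag_rhs_rank%04d.txt", rank);

    FILE *fp = fopen(name, "w");
    for (int c = 0; c < n_owned; ++c)
        fprintf(fp, "%ld %.16e %.16e\n", global_id[c], diag[c], rhs[c]);
    fclose(fp);
}
```

Concatenating the per-rank files and sorting them by global ID gives a file that can be diffed directly against the serial dump.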
#8
Senior Member
Join Date: Oct 2011
Posts: 242
Rep Power: 17
Your first-order results are in my opinion NOT fine. Even if they look almost the same on the graphs, the error is much higher than machine precision.
So before switching to second order, you should first fix this at first order. Can you switch your temporal scheme to a simple explicit time integrator, like a 1st-order Euler scheme, for testing purposes? If that works, then add the new ingredients one at a time: implicit, 2nd order, etc.
Edit: you say you fill the ghost cells at the END of the time step. Is the initial solution communicated before the first time step?
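To make the last point concrete: a bare-bones explicit Euler driver needs a halo exchange before the very first residual evaluation, something like the skeleton below (routine and array names are placeholders).

```c
/* Minimal explicit-Euler driver skeleton. The key point is the exchange
 * before the first residual evaluation: ghosts must hold the initial
 * solution, not uninitialised memory. All names are placeholders. */
void exchange_ghost_solution(void);   /* halo exchange of conserved variables */
void compute_residual(double *res);   /* fluxes -> residual per owned cell    */

void explicit_euler(double *u, double *res, const double *dt_over_vol,
                    int n_owned, int n_vars, int n_steps)
{
    exchange_ghost_solution();                      /* ghosts for step 1       */

    for (int step = 0; step < n_steps; ++step) {
        compute_residual(res);
        for (int c = 0; c < n_owned; ++c)
            for (int v = 0; v < n_vars; ++v)        /* u^{n+1} = u^n - dt/V R  */
                u[c * n_vars + v] -= dt_over_vol[c] * res[c * n_vars + v];
        exchange_ghost_solution();                  /* ghosts now hold u^{n+1} */
    }
}
```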
#9
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 8
Dear naffrancois,
As per your suggestion, I tried running the same case using explicit Euler time stepping, and it gives an exact match (13 decimal places) for the first-order calculations. Please see the attached results. But once I use the 2nd-order limiter, it gives a result similar to the implicit code, even with explicit Euler time stepping. I also ran the explicit code with the limiter fixed at a value of 1.0, which gives a match up to 11-12 decimal places, suggesting that the error is not in the gradient calculations.
#10
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 8
We have finally fixed a bug in the 2nd-order convective flux limiter calculation, so the serial and parallel results with explicit Euler time stepping now match to machine precision.
Now we have a slight difference between the serial and parallel results for implicit time stepping with the LU-SGS solver for the same case. Could this be because of a mismatch in the LU-SGS calculation caused by the decomposition of the computational domain in the parallel code? Kindly suggest a way to fix it.
#11
Senior Member
Join Date: Oct 2011
Posts: 242
Rep Power: 17
Well, you have made a big step in finding the bug in the explicit solver. I am not familiar with LU-SGS time stepping, but clearly extra care has to be taken. Not long ago I parallelized an unstructured implicit scheme using a 1st-order-in-space Jacobian with GMRES, and I did it this way: each CPU owns a submatrix of size Nelem x Nelem_overlap, like an augmented matrix. In Krylov linear solvers, the typical operations are matrix-vector products with the Jacobian of the system Jac*x = b, and scalar products. Communication of the x vector is then necessary prior to the matrix-vector product, and a sum reduction (MPI_ALLREDUCE) is needed for the scalar products.
I don't know how this applies to LU-SGS; maybe you forgot contributions coming from ghost cells in the upper part? By the way, I assume you are testing LU-SGS at 1st order.
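Those two parallel kernels look roughly like the sketch below on a partitioned mesh; the CSR-type storage and the exchange routine are assumptions, not anyone's actual code.

```c
/* Parallel kernels for a Krylov solver on a partitioned mesh: a matrix-vector
 * product preceded by a halo exchange of x, and a global scalar product via
 * MPI_Allreduce. Storage layout and names are placeholders. */
#include <mpi.h>

void exchange_ghost_values(double *x);   /* fills x[n_owned .. n_owned+n_ghost-1] */

/* y = A*x for the rows owned by this rank; columns may refer to ghost entries. */
void parallel_matvec(int n_owned, const int *row_ptr, const int *col_idx,
                     const double *val, double *x, double *y)
{
    exchange_ghost_values(x);            /* neighbours' x values first */
    for (int i = 0; i < n_owned; ++i) {
        double s = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            s += val[k] * x[col_idx[k]];
        y[i] = s;
    }
}

/* Global dot product over owned entries only (ghosts excluded to avoid
 * double counting), summed across all ranks. */
double parallel_dot(int n_owned, const double *a, const double *b)
{
    double local = 0.0, global = 0.0;
    for (int i = 0; i < n_owned; ++i)
        local += a[i] * b[i];
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    return global;
}
```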
#12
New Member
Nived M R
Join Date: May 2018
Posts: 6
Rep Power: 8
Right now we are not using any MPI functions inside the LU-SGS solver. We are working out a possible way to account for the missing neighbour-cell information in the LU-SGS matrix on each process.
#13
Senior Member
Join Date: Oct 2011
Posts: 242
Rep Power: 17
I would say that as long as you have off-diagonal blocks in your matrix, i.e. it is not a block-diagonal matrix, you have neighbour information for each element. Then at some point in your LU-SGS solver you will need MPI calls to synchronize values between the partitions. Any time you need a value coming from a neighbour element, you have to ask yourself whether that value may belong to a different CPU.
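As a minimal illustration of that principle (not of LU-SGS itself), here is a scalar Jacobi-type relaxation on a partitioned mesh, where every sweep first refreshes the off-rank neighbour unknowns; the LU-SGS forward and backward sweeps need the same kind of synchronization wherever an off-diagonal block points at a cell owned by another rank. All names and the storage layout are assumptions.

```c
/* Scalar Jacobi relaxation on a partitioned mesh: before each sweep the ghost
 * tail of x is refreshed so that off-diagonal terms referring to cells owned
 * by other ranks use up-to-date values. Names are placeholders. */
void exchange_ghost_values(double *x);   /* fills x[n_owned .. n_owned+n_ghost-1] */

/* Approximately solve diag_i*x_i + sum_j offdiag_ij*x_j = b_i by Jacobi sweeps.
 * x carries a ghost tail, so col_idx may point at cells owned by other ranks. */
void jacobi_sweeps(int n_owned, int n_sweeps,
                   const int *row_ptr, const int *col_idx,   /* off-diagonal CSR */
                   const double *offdiag, const double *diag,
                   const double *b, double *x, double *x_new)
{
    for (int s = 0; s < n_sweeps; ++s) {
        exchange_ghost_values(x);             /* neighbour unknowns, incl. off-rank */
        for (int i = 0; i < n_owned; ++i) {
            double r = b[i];
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
                r -= offdiag[k] * x[col_idx[k]];   /* col_idx may be a ghost index */
            x_new[i] = r / diag[i];
        }
        for (int i = 0; i < n_owned; ++i)
            x[i] = x_new[i];
    }
}
```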
#14
Senior Member
Join Date: Dec 2017
Posts: 153
Rep Power: 8
Quote: