|
February 17, 2011, 16:20 |
Different results from AMD and Intel machines
|
#1 |
New Member
Join Date: Feb 2011
Posts: 6
Rep Power: 15 |
Hi,
I hope someone can help me out here: I get slightly different numerical results from an AMD machine and an Intel machine using exactly the same code (MPI parallel). For serial runs the difference is much smaller and shows up only in some individual cells, around the 8th or 9th decimal digit, but it is still there. Debugging has not turned up a problem, or at least not the root cause yet. Has anyone here had such an experience, or any idea? Thanks. Michael Last edited by MichaelCFD; February 17, 2011 at 16:37. |
|
February 17, 2011, 16:23 |
|
#2 |
New Member
Join Date: Feb 2011
Posts: 6
Rep Power: 15 |
btw, the code is in C. Thanks.
|
|
February 17, 2011, 19:05 |
|
#3 |
Senior Member
|
Hi Michael,
It is not uncommon to get slightly different results on different architectures or operating systems. For example, the following two initializations look identical but may be treated slightly differently:

double c = 0;
double c = 0.d0;

I also assume that there is no random number generator in the code. The main question is: do the differences significantly affect the results at convergence? Regards, Julien
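A minimal C sketch of this kind of architecture sensitivity, assuming (as one common cause) that the difference comes from the precision used for intermediate results (x87 80-bit versus SSE 64-bit evaluation, reported by FLT_EVAL_METHOD) rather than from the source code itself:

/* Minimal sketch: the same C source can round differently depending on
 * the precision used for intermediates. FLT_EVAL_METHOD is 0 when
 * expressions are evaluated in the declared type (typical SSE2 code)
 * and 2 when they are kept in long double (typical x87 code). */
#include <stdio.h>
#include <float.h>

int main(void)
{
    printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);

    double x = 1.0 / 3.0;
    /* With strict double evaluation, x*3.0 rounds to exactly 1.0 and r is 0.
     * With 80-bit intermediates the product keeps its extra bits and r
     * may come out as a tiny negative number instead. */
    double r = x * 3.0 - 1.0;
    printf("x*3 - 1 = %.17e\n", r);
    return 0;
}

Whether the last bits match therefore depends on how the compiler and the floating-point unit handle intermediates, not only on the source.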
__________________
--- Julien de Charentenay |
|
February 17, 2011, 19:22 |
|
#4 |
Senior Member
Martin Hegedus
Join Date: Feb 2011
Posts: 500
Rep Power: 19 |
Is the code single precision or double precision?
Are you running the same executable on both machines, or did you recompile the code on each? If you recompiled it, what level of optimization did you use?
What type of solver is it: structured or unstructured, implicit or explicit? Is the solution steady or unsteady? If steady, did you converge it to machine zero? If unsteady, at what point do you see the difference build up?
You mentioned MPI. Does this mean you are using multiple machines, or just one machine with multiple cores? Also, how is your domain broken up? For example, I would expect an implicit structured chimera solver to converge differently depending on how the overall grid is partitioned and passed out among the various cores. |
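One reason the decomposition question matters, sketched in plain C (no MPI, just the arithmetic): floating-point addition is not associative, so accumulating the same numbers in a different order, as a different partitioning or reduction tree would, can change the low-order bits of a residual or an integrated force.

/* Sketch: summing the same values in a different grouping gives a
 * different rounded result, which is why the way a domain is split
 * among cores can change the last digits of a global sum. */
#include <stdio.h>

int main(void)
{
    double big = 1.0e16, small = 1.0;

    double left_to_right = (big + small) + small; /* each small is absorbed     */
    double regrouped     = big + (small + small); /* small + small = 2 survives */

    printf("%.1f\n%.1f\n", left_to_right, regrouped);
    /* Typically prints:
     *   10000000000000000.0
     *   10000000000000002.0 */
    return 0;
}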
|
February 21, 2011, 11:31 |
Different results using the same executable
|
#5 |
New Member
Join Date: Feb 2011
Posts: 6
Rep Power: 15 |
Thanks for all your responses and help.
I use the same executable, compiled on an Intel machine. There is no random number generator. I also tested printing (printf) the following, as suggested: double a=0; double a=0.d0; The output is the same on AMD and Intel to many decimal digits. The convergence is fine on both, and the parallel part does exactly the same thing. The problem is that even for serial runs there are digit differences between the two machines. I wonder if anyone has tested a code like this before. Maybe I should grab another code to test... |
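When comparing printed values between the two machines, it can also help to compare them bit for bit rather than to a number of decimal digits; a small sketch (illustrative only):

/* Sketch: print a double as 17 significant decimal digits, as C99
 * hexadecimal floating point (%a), and as its raw 64-bit pattern, so
 * outputs from the two machines can be diffed exactly. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

static void dump(const char *name, double v)
{
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);   /* reinterpret the bits portably */
    printf("%s = %.17g  %a  0x%016llx\n",
           name, v, v, (unsigned long long)bits);
}

int main(void)
{
    dump("a", 0.0);
    dump("b", 1.0 / 3.0);
    return 0;
}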
|
February 21, 2011, 13:37 |
|
#6 |
Senior Member
Martin Hegedus
Join Date: Feb 2011
Posts: 500
Rep Power: 19 |
Assuming your code is double precision, both runs are doing exactly the same thing, and the case is steady state:
1) Euler should be run first for comparison, then laminar, then turbulent.
2) The results should be converged to machine zero. In general, the residual should be in the vicinity of 1.0e-15 to 1.0e-16.
3) Assuming you don't have some very fine cells, or cells with poor Jacobians, the differences in state variables (i.e. rho, rho-v, p, etc.) between the two runs should be less than 1.0e-10, IMO.
4) Integrated load values, such as lift and drag, can be off by much more, since they depend on integrating pressure differences. |
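A hypothetical helper for quantifying point 3 (the names and the stand-in data below are illustrative, not from any particular code): it reports the largest absolute and relative difference between the same field produced by the two runs.

/* Illustrative helper: largest absolute and relative difference between
 * the same state-variable field from the two runs, to compare against
 * the ~1.0e-10 rule of thumb above. */
#include <stdio.h>
#include <stddef.h>
#include <math.h>

static void compare_fields(const double *run_a, const double *run_b, size_t n)
{
    double max_abs = 0.0, max_rel = 0.0;
    size_t worst = 0;

    for (size_t i = 0; i < n; ++i) {
        double d = fabs(run_a[i] - run_b[i]);
        double s = fmax(fabs(run_a[i]), fabs(run_b[i]));
        if (d > max_abs) { max_abs = d; worst = i; }
        if (s > 0.0 && d / s > max_rel) max_rel = d / s;
    }
    printf("max |diff| = %.3e at cell %zu, max relative diff = %.3e\n",
           max_abs, worst, max_rel);
}

int main(void)
{
    /* Stand-in data; in practice each array would be read from one run's output. */
    double rho_intel[] = { 1.00, 1.2250000000001, 0.98 };
    double rho_amd[]   = { 1.00, 1.2250000000002, 0.98 };
    compare_fields(rho_intel, rho_amd, 3);
    return 0;
}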
|
February 22, 2011, 13:19 |
I use the same executable for AMD and Intel machines
|
#7 |
New Member
Join Date: Feb 2011
Posts: 6
Rep Power: 15 |
Initially the difference is very small, around 1.e-14, but as the iterations go on it becomes significantly larger... Has anyone here tested their own code on different machines? Thanks.
|
|
February 22, 2011, 13:36 |
|
#8 |
Senior Member
Martin Hegedus
Join Date: Feb 2011
Posts: 500
Rep Power: 19 |
Yes I have, and the answer you are looking for depends on the case you are running, the solution methodology, the grid, and what you are comparing. A simple answer does not exist.
You haven't given enough details to help address your question. |
|
February 23, 2011, 22:51 |
|
#9 |
New Member
Join Date: Feb 2011
Posts: 6
Rep Power: 15 |
So did you find any differences in the results from the two machines? If so, what was the cause? My code is a typical CFD code: unsteady or steady, finite volume, ... but I do not think those factors should produce such a machine-to-machine difference... Thanks.
|
|
February 23, 2011, 23:57 |
|
#10 |
Senior Member
Martin Hegedus
Join Date: Feb 2011
Posts: 500
Rep Power: 19 |
For an unsteady result, once the solutions diverge, even by an epsilon, they will continue to diverge. So in that case a difference of 1e-8 in a field value isn't anything special.
However, my general experience is that field values for steady results converged to machine zero agree to within 1e-12 between AMD and Intel for solutions on high quality grids. I take notice if the results differ by more than 1e-10; in my experience, when that is the case, there is a significant probability of a bug in the code. However, this is only true if the solution is independent of the number of cores solving the problem. For example, the solution during convergence of a steady problem with an implicit method very likely depends on the number of CPU cores; this difference should diminish as the solution converges to machine zero, assuming the right hand side is independent of how the problem is broken up among the various cores.
It is also important to take into account nonlinearities of the flow being analyzed: epsilon changes in a shock or a vortex can cause noticeable differences in other areas of the flow field. |
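A toy illustration of the first point (a chaotic scalar map, not a flow solver, so only the qualitative behavior carries over): a perturbation of one part in 1e16 grows until the two trajectories differ in their leading digits.

/* Toy illustration: in a nonlinear iteration, an epsilon-sized
 * difference grows roughly exponentially, so two unsteady runs that
 * agree to machine precision at first eventually diverge visibly. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double x = 0.4;             /* "run on machine A" */
    double y = 0.4 + 1e-16;     /* "run on machine B", off by one epsilon */

    for (int n = 1; n <= 60; ++n) {
        x = 3.9 * x * (1.0 - x);    /* logistic map in its chaotic regime */
        y = 3.9 * y * (1.0 - y);
        if (n % 10 == 0)
            printf("step %2d: |x - y| = %.3e\n", n, fabs(x - y));
    }
    return 0;
}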
|
|
|