October 3, 2021, 06:51 |
Double precision cpu
|
#1 |
New Member
Join Date: Aug 2021
Posts: 22
Rep Power: 5 |
When I started a simulation in Abaqus, I got a warning in which the software recommended running in double precision (FP64) because the simulation needs more than 20 million iterations.
So I would like to ask: which 4-5 CPUs have the best double-precision (FP64) performance? |
|
October 3, 2021, 07:37 |
|
#2 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,426
Rep Power: 49 |
CPUs aren't GPUs
That is to say, the CPU market is not segmented into product lines with particularly high or low FP64 performance. Assuming the code is not perfectly vectorized (which most software isn't) the main difference between running FP64 vs FP32 is an increase in memory bandwidth requirement. So you can get a slight increase in performance by making sure you get a CPU with a lot of memory channels. But the performance difference will be rather small compared to the factors of 16, 32 or even 64 we see in GPU floating point performance. |
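The memory bandwidth point above can be illustrated with a little arithmetic. This is only a rough sketch: the problem size and the sustained bandwidth figure are made-up assumptions, not measurements of any particular CPU or of Abaqus.

```python
from array import array

def bytes_per_sweep(n_values: int, typecode: str) -> int:
    """Bytes that have to be streamed from memory to read n_values once."""
    # array("f") stores C floats (FP32), array("d") stores C doubles (FP64)
    return n_values * array(typecode).itemsize

N_VALUES = 10_000_000        # hypothetical number of stored values
BANDWIDTH_BYTES_S = 100e9    # hypothetical sustained memory bandwidth (100 GB/s)

fp32_bytes = bytes_per_sweep(N_VALUES, "f")
fp64_bytes = bytes_per_sweep(N_VALUES, "d")
ratio = fp64_bytes / fp32_bytes

# If the solver is limited by memory bandwidth, time per sweep scales with bytes moved
fp32_time = fp32_bytes / BANDWIDTH_BYTES_S
fp64_time = fp64_bytes / BANDWIDTH_BYTES_S

print(f"FP32: {fp32_bytes / 1e6:.0f} MB per sweep, {fp32_time * 1e3:.2f} ms")
print(f"FP64: {fp64_bytes / 1e6:.0f} MB per sweep, {fp64_time * 1e3:.2f} ms")
print(f"FP64 moves {ratio:.0f}x the data of FP32")
```

Under these assumptions the FP64 run moves twice the data, which is why more memory channels help, and why the gap stays far below the factors seen on GPUs.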
|
October 3, 2021, 08:46 |
|
#3 |
New Member
Join Date: Aug 2021
Posts: 22
Rep Power: 5 |
"That is to say, the CPU market is not segmented into product lines with particularly high or low FP64 performance"
Yes, but low-end CPUs, for example, will have lower FP64 performance than high-end ones. I need to know which CPUs have the highest FP64 performance, for example an EPYC Milan like the 7313P, or a Xeon Silver? I need to know the 4-5 CPUs with the highest FP64 performance, and they will be tested by the shop where I will buy the CPU and motherboard. If, for example, a simulation in Abaqus will only start with FP64, I will have to use FP64. I don't want to use FP64, but in some cases I will have to, so I should buy the CPU with the best FP64 performance. Do you think the EPYC 7313P will have better FP64 performance than the Xeon Silver 4316? |
|
October 3, 2021, 12:47 |
|
#4 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,426
Rep Power: 49 |
Let me reiterate that: "FP64" isn't anything special for a CPU. Every time I or someone else here talks about performance of CPUs in general terms, that also includes floating point calculations with 64-Bit variables. Matter of fact, when running a CFD solver in double precision, memory bandwidth limitations may become more pronounced.
So you don't need any special advice for a system that can handle FP64 particularly well. All the information is already there.
Abaqus itself is a different story entirely. Their "standard" and "explicit" solvers behave quite differently. Explicit is more akin to a well-behaved CFD solver. Scales nicely even on distributed parallel systems. Standard is...different. I pieced together that they have something along the lines of a hybrid OpenMP-MPI parallelization going on. Support wasn't able to tell me what exactly it is, or how to control stuff like core binding and mapping. Without any additional arguments, you only get the OpenMP type of parallelization, and it doesn't scale as nicely as explicit. The only upside of this solver -from the perspective of someone who doesn't use it- is that GPU acceleration works exceptionally well.
I urge you to take a step back before you start throwing money at the problem. For example, standard and explicit solvers both have their uses for specific types of problems. E.g. explicit can be better for dynamic problems like crash simulation. If the solver decides that 20 million iterations are necessary, maybe you are using the wrong solver. Or there might be another solver setting that needs to be addressed: https://info.simuleon.com/blog/7-tip...qus-run-faster
I can't tell you which one, you would need an expert in Abaqus for that. All I know is that I would carefully re-evaluate my modelling approach if it turned out that my CFD simulation requires 20 million iterations. Because I know that no amount of money spent on hardware will get me through that. |
|
November 14, 2021, 09:50 |
|
#5 |
New Member
Join Date: Aug 2021
Posts: 22
Rep Power: 5 |
So you mean processors have the same performance in single and double precision?
For example, if a processor has 76 TFLOPS in single precision, will it also have 76 TFLOPS in double precision? |
|
November 15, 2021, 03:33 |
|
#6 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,426
Rep Power: 49 |
Not necessarily the same as in identical FLOPS for FP32 and FP64. That's a rather complicated topic, and I can't say that I understand every minute detail. Execution units (EUs) can be multi-purpose, doing either one FP64 calculation, or two FP32 calculations in the same amount of time.
That means worst-case is 2:1 FLOPS for FP32 vs. FP64. But most CPUs these days are very similar in that regard, with none being particularly well-suited for either type of calculation. That's all really theoretical anyway, since real software won't run anywhere near the limit for floating point calculations of a CPU. Especially for parallel CFD and FEM, other bottlenecks are hit before that, due to low computational intensity, and data access which is never fully predictable by the prefetcher. The cache and memory subsystem are limiting factors, preventing the FPUs from operating at their maximum theoretical throughput. See e.g. here what it takes to get a CPU to operate near its theoretical FLOPS limit: https://stackoverflow.com/questions/...lops-per-cycle |
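To put rough numbers on the 2:1 ratio and on the memory bottleneck described above, here is a back-of-the-envelope sketch. Every figure in it (core count, clock, number and width of FMA units, bandwidth, FLOPs per byte) is a hypothetical placeholder and does not describe a specific CPU or solver. The second function is the simple roofline estimate: attainable throughput is the smaller of the compute peak and bandwidth times computational intensity.

```python
def peak_gflops(cores: int, clock_ghz: float, fma_units: int,
                vector_bits: int, value_bits: int) -> float:
    """Theoretical peak in GFLOPS for one value width (32 or 64 bits)."""
    lanes = vector_bits // value_bits        # values processed per vector register
    flops_per_cycle = fma_units * lanes * 2  # one FMA counts as 2 FLOPs (multiply + add)
    return cores * clock_ghz * flops_per_cycle

def attainable_gflops(peak: float, bandwidth_gb_s: float,
                      flops_per_byte: float) -> float:
    """Roofline estimate: capped by compute peak or by memory traffic."""
    return min(peak, bandwidth_gb_s * flops_per_byte)

# Hypothetical CPU: 16 cores at 3.0 GHz, two 256-bit FMA units per core
CORES, CLOCK_GHZ, FMA_UNITS, VECTOR_BITS = 16, 3.0, 2, 256
peak_fp32 = peak_gflops(CORES, CLOCK_GHZ, FMA_UNITS, VECTOR_BITS, 32)
peak_fp64 = peak_gflops(CORES, CLOCK_GHZ, FMA_UNITS, VECTOR_BITS, 64)

# Hypothetical solver: 0.25 FLOPs per byte in FP64 on a 200 GB/s memory system
BANDWIDTH_GB_S = 200.0
FP64_FLOPS_PER_BYTE = 0.25
real_fp64 = attainable_gflops(peak_fp64, BANDWIDTH_GB_S, FP64_FLOPS_PER_BYTE)

print(f"Peak FP32: {peak_fp32:.0f} GFLOPS")
print(f"Peak FP64: {peak_fp64:.0f} GFLOPS")
print(f"Attainable FP64 for this solver: {real_fp64:.0f} GFLOPS")
```

With these placeholder values the memory-bound estimate lands far below the FP64 peak, which is the reason the peak FLOPS figure says little about CFD or FEM run times.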
|
November 15, 2021, 06:14 |
|
#7 |
New Member
Join Date: Aug 2021
Posts: 22
Rep Power: 5 |
There are some kinds of simulations which Abaqus will not run; it displays a warning stating that the simulation must be run in double precision. So I am considering which would be more effective: only a good CPU with 128 GB of RAM, for example the EPYC 7313P, or a weaker CPU like a Xeon or Core i9-12900K plus a good GPU like the Quadro GP100 or Tesla P100. This GPU has a theoretical double-precision performance of 4.7 TFLOPS, so I need to know what the theoretical double-precision performance of a processor is.
|
|
November 15, 2021, 07:33 |
|
#8 | |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,426
Rep Power: 49 |
The recommendation for Abaqus -not just from me, but also from their support- is to use several lower-end GPUs instead of a single high-end one. Whether GPU acceleration works in your case, I don't know. I'm out of things to add to this conversation. Everything I know is already here. |
||
November 16, 2021, 04:17 |
|
#9 | |
Member
EM
Join Date: Sep 2019
Posts: 59
Rep Power: 7 |
The absolute top in CPU FP64 performance is attained by Xeons with two AVX-512 units, together with MKL.
Last edited by gnwt4a; November 16, 2021 at 08:55. |
||
|
|