Double precision cpu

#1 | Cinek_Poland (New Member) | October 3, 2021, 06:51
When I started a simulation in Abaqus, a warning appeared recommending that the simulation be run in double precision (FP64), because it needs more than 20 million iterations.
So I would like to ask: which 4-5 CPUs have the best double precision (FP64) performance?
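For context, a likely reason a solver recommends double precision at such iteration counts is round-off accumulation in the running time total. The sketch below is only an illustration of that effect, not Abaqus's actual time integration; the step size `dt = 1e-6` and the helper `accumulate_time` are assumptions chosen for the demonstration.

```python
import numpy as np

def accumulate_time(dt, n_steps, dtype, chunk=1_000_000):
    """Add dt to a running total n_steps times, rounding to dtype after every add.

    Works in chunks so memory use stays small; np.add.accumulate performs a
    sequential running sum in the array's own precision.
    """
    total = dtype(0.0)
    step = dtype(dt)
    done = 0
    while done < n_steps:
        m = min(chunk, n_steps - done)
        block = np.full(m, step, dtype=dtype)
        block[0] += total                      # carry the running total into this block
        total = np.add.accumulate(block)[-1]   # last partial sum of the block
        done += m
    return float(total)

# Hypothetical numbers: 20 million increments of 1e-6 s, exact end time 20 s.
dt, n_steps = 1e-6, 20_000_000
t32 = accumulate_time(dt, n_steps, np.float32)
t64 = accumulate_time(dt, n_steps, np.float64)
print(f"exact end time : {dt * n_steps:.6f}")
print(f"FP32 end time  : {t32:.6f}")
print(f"FP64 end time  : {t64:.6f}")
```

Once the running total is large compared to the increment, FP32 can no longer represent the sum accurately and the accumulated time drifts away from the exact value, while FP64 stays close to it.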

#2 | flotus1 (Alex), Super Moderator, Germany | October 3, 2021, 07:37
CPUs aren't GPUs.
That is to say, the CPU market is not segmented into product lines with particularly high or low FP64 performance.
Assuming the code is not perfectly vectorized (which most software isn't), the main difference between running FP64 and FP32 is an increase in memory bandwidth requirement. So you can get a slight increase in performance by making sure you get a CPU with a lot of memory channels. But the performance difference will be rather small compared to the factors of 16, 32 or even 64 we see between FP32 and FP64 floating point performance on GPUs.
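To put rough numbers on the bandwidth argument, here is a minimal sketch that estimates how long it takes just to stream an array from RAM once. The per-channel figure of 25.6 GB/s is the theoretical peak of one DDR4-3200 channel; the array size and channel counts are made-up values, and `sweep_time_seconds` is a hypothetical helper.

```python
BYTES_PER_VALUE = {"fp32": 4, "fp64": 8}

def sweep_time_seconds(n_values, precision, channels, gb_per_s_per_channel=25.6):
    """Lower bound on the time to read n_values once from main memory."""
    n_bytes = n_values * BYTES_PER_VALUE[precision]
    bandwidth = channels * gb_per_s_per_channel * 1e9  # bytes per second
    return n_bytes / bandwidth

n_values = 100_000_000  # hypothetical field size
for channels in (2, 8):
    for precision in ("fp32", "fp64"):
        t = sweep_time_seconds(n_values, precision, channels)
        print(f"{channels} channels, {precision}: {t * 1e3:.2f} ms per sweep")
```

For a memory-bound loop, FP64 moves twice the bytes of FP32, and going from 2 to 8 memory channels cuts the streaming time by a factor of 4, regardless of precision.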

#3 | Cinek_Poland (New Member) | October 3, 2021, 08:46
"That is to say, the CPU market is not segmented into product lines with particularly high or low FP64 performance"
Yes, but low-end CPUs, for example, will have lower FP64 performance than high-end ones. I need to know which CPUs have the highest FP64 performance: for example, an EPYC Milan like the 7313P, or a Xeon Silver?
I need to know the 4-5 CPUs with the highest FP64 performance; they will be tested by the shop where I will buy the CPU and motherboard.
If, for example, a simulation in Abaqus will only start in FP64, I will have to use FP64.
I don't want to use FP64, but in some cases I will have to, so I should buy a CPU with the best FP64 performance. Do you think the EPYC 7313P will have better FP64 performance than the Xeon Silver 4316?

#4 | flotus1 (Alex), Super Moderator, Germany | October 3, 2021, 12:47
Let me reiterate that: "FP64" isn't anything special for a CPU. Every time I or someone else here talks about performance of CPUs in general terms, that also includes floating point calculations with 64-bit variables. As a matter of fact, when running a CFD solver in double precision, memory bandwidth limitations may become more pronounced.
So you don't need any special advice for a system that can handle FP64 particularly well. All the information is already there.

Abaqus itself is a different story entirely. Their "standard" and "explicit" solvers behave quite differently. Explicit is more akin to a well-behaved CFD solver. Scales nicely even on distributed parallel systems.
Standard is... different. I pieced together that they have something along the lines of a hybrid OpenMP-MPI parallelization going on. Support wasn't able to tell me what exactly it is, or how to control things like core binding and mapping. Without any additional arguments, you only get the OpenMP type of parallelization, and it doesn't scale as nicely as explicit. The only upside of this solver (from the perspective of someone who doesn't use it) is that GPU acceleration works exceptionally well.

I urge you to take a step back before you start throwing money at the problem. For example, standard and explicit solvers both have their uses for specific types of problems. E.g. explicit can be better for dynamic problems like crash simulation.
If the solver decides that 20 million iterations are necessary, maybe you are using the wrong solver. Or there might be another solver setting that needs to be addressed:
https://info.simuleon.com/blog/7-tip...qus-run-faster
I can't tell you which one, you would need an expert in Abaqus for that. All I know is that I would carefully re-evaluate my modelling approach if it turned out that my CFD simulation requires 20 million iterations. Because I know that no amount of money spent on hardware will get me through that.

#5 | Cinek_Poland (New Member) | November 14, 2021, 09:50
So you mean processors have the same performance in single and double precision?
For example, if a processor has 76 TFLOPS in single precision, will it also have 76 TFLOPS in double precision?

#6 | flotus1 (Alex), Super Moderator, Germany | November 15, 2021, 03:33
Not necessarily the same, as in identical FLOPS for FP32 and FP64. That's a rather complicated topic, and I can't say that I understand every minute detail. Execution units (EUs) can be multi-purpose, doing either one FP64 calculation or two FP32 calculations in the same amount of time.
That means the worst case is a 2:1 ratio of FP32 to FP64 FLOPS. But most CPUs these days are very similar in that regard, with none being particularly well-suited for either type of calculation.
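As a back-of-the-envelope illustration of where the 2:1 ratio comes from, theoretical peak FLOPS is commonly estimated as cores × clock × SIMD lanes × FMA units × 2. The CPU described below (16 cores, 3.0 GHz, 256-bit SIMD, two FMA units per core) is a made-up example, not a specific product, and `peak_gflops` is a hypothetical helper.

```python
def peak_gflops(cores, clock_ghz, simd_bits, fma_units, precision_bits):
    """Theoretical peak in GFLOPS, assuming every cycle issues full-width FMAs."""
    lanes = simd_bits // precision_bits       # values per SIMD register
    flops_per_cycle = lanes * fma_units * 2   # one FMA counts as multiply + add
    return cores * clock_ghz * flops_per_cycle

# Hypothetical CPU, for illustration only.
cpu = dict(cores=16, clock_ghz=3.0, simd_bits=256, fma_units=2)
fp64 = peak_gflops(precision_bits=64, **cpu)
fp32 = peak_gflops(precision_bits=32, **cpu)
print(f"FP64 peak: {fp64:.0f} GFLOPS")
print(f"FP32 peak: {fp32:.0f} GFLOPS")
print(f"FP32:FP64 ratio: {fp32 / fp64:.0f}:1")
```

The ratio falls out of the register width alone: a register of a given size holds twice as many 32-bit values as 64-bit values.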
That's all really theoretical anyway, since real software won't run anywhere near the limit for floating point calculations of a CPU. Especially for parallel CFD and FEM, other bottlenecks are hit before that, due to low computational intensity, and data access which is never fully predictable by the prefetcher. The cache and memory subsystem are limiting factors, preventing the FPUs from operating at their maximum theoretical throughput.
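One common way to make this concrete is a roofline-style estimate: attainable throughput is the smaller of the compute peak and memory bandwidth times computational intensity. All numbers below are invented for illustration (a 768 GFLOPS FP64 peak, 200 GB/s of memory bandwidth, and a kernel doing 0.125 FLOPs per byte moved), and `attainable_gflops` is a hypothetical helper.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline estimate: limited by compute or by memory traffic, whichever is lower."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)

peak = 768.0        # hypothetical FP64 peak in GFLOPS
bandwidth = 200.0   # hypothetical memory bandwidth in GB/s

low_intensity = attainable_gflops(peak, bandwidth, flops_per_byte=0.125)
high_intensity = attainable_gflops(peak, bandwidth, flops_per_byte=10.0)
print(f"low-intensity kernel : {low_intensity:.0f} GFLOPS")
print(f"high-intensity kernel: {high_intensity:.0f} GFLOPS")
```

With low computational intensity, the estimate sits far below the compute peak, so a CPU with a higher peak but the same memory bandwidth would not run such a kernel any faster.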

See e.g. here what it takes to get a CPU to operate near its theoretical FLOPS limit: https://stackoverflow.com/questions/...lops-per-cycle

#7 | Cinek_Poland (New Member) | November 15, 2021, 06:14
There are some kinds of simulations that Abaqus will not run; it displays a warning saying that the simulation must be done in double precision. So I am considering which would be more effective: only a good CPU plus 128 GB of RAM, for example an EPYC 7313P, or a weaker CPU like a Xeon or a Core i9-12900K together with a good GPU like a Quadro GP100 or Tesla P100. This GPU has a theoretical double precision performance of 4.7 TFLOPS, so I need to know what the theoretical double precision performance of a processor is.

#8 | flotus1 (Alex), Super Moderator, Germany | November 15, 2021, 07:33
Quote:
This GPU has a theoretical double precision performance of 4.7 TFLOPS, so I need to know what the theoretical double precision performance of a processor is.
No, you really don't need to know that. Peak FLOPS for GPU acceleration is even more irrelevant than for CPUs.
The recommendation for Abaqus, not just from me but also from their support, is to use several lower-end GPUs instead of a single high-end one. Whether GPU acceleration works in your case, I don't know.
I'm out of things to add to this conversation. Everything I know is already here.

#9 | gnwt4a (EM), Member | November 16, 2021, 04:17
Quote:
Originally posted by Cinek_Poland:
When I started a simulation in Abaqus, a warning appeared recommending that the simulation be run in double precision (FP64), because it needs more than 20 million iterations.
So I would like to ask: which 4-5 CPUs have the best double precision (FP64) performance?



The absolute top in CPU FP64 performance is attained by Xeons with two AVX-512 units and MKL.
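Rather than comparing datasheet numbers, achieved dense-matrix throughput can be measured directly. The sketch below times a matrix product in NumPy, which hands the work to whatever BLAS library it was built against; that may or may not be MKL on a given installation, so this is an assumption to check. The matrix size and the helper `measured_gflops` are arbitrary choices for illustration.

```python
import time
import numpy as np

def measured_gflops(n, dtype, repeats=3):
    """Best-of-N achieved GFLOPS for an n x n matrix product in the given precision."""
    rng = np.random.default_rng(0)
    a = rng.random((n, n)).astype(dtype)
    b = rng.random((n, n)).astype(dtype)
    a @ b  # warm-up run so library start-up cost is not timed
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a @ b
        best = min(best, time.perf_counter() - start)
    return 2.0 * n**3 / best / 1e9  # a matrix product costs about 2*n^3 FLOPs

n = 1024  # arbitrary size; larger matrices usually get closer to peak
print(f"FP64: {measured_gflops(n, np.float64):.1f} GFLOPS")
print(f"FP32: {measured_gflops(n, np.float32):.1f} GFLOPS")
```

Dense matrix multiplication is close to a best case for a CPU; FEM and CFD solvers typically achieve a much smaller fraction of peak, so results like these are an upper bound rather than a prediction of solver speed.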

Last edited by gnwt4a; November 16, 2021 at 08:55.