August 27, 2018, 12:05 |
|
#41 |
Member
Join Date: Dec 2016
Posts: 44
Rep Power: 9 |
Not only that, the solution process will probably abort with an error.
http://www.cadfem.de/fileadmin/CADFE...CADFEM_GPU.pdf |
|
August 27, 2018, 12:14 |
|
#42 |
Member
Join Date: Jun 2010
Posts: 77
Rep Power: 16 |
Can I use a Quadro K6000 plus a Tesla K80? Will they work together?
|
|
August 27, 2018, 12:32 |
|
#43 |
Senior Member
Micael
Join Date: Mar 2009
Location: Canada
Posts: 157
Rep Power: 18 |
Flow Setup:
as OP

Software/Hardware:
Operating system: CentOS Linux 7
Fluent version: Ansys Fluent 19.1
CPU: Dual Xeon Gold 6150, HT disabled
Memory: 192 GB DDR4-2666 ECC (12 DIMMs x 16 GB)
GPU: 4 x V100-32GB NVLINK

Did only double precision.

32-core
Simple, no GPU: 4.1 s
Simple, 1 GPU: 4.1 s
Simple, 4 GPUs: 4.1 s
Coupled, no GPU: 13.2 s
Coupled, 1 GPU: 28.1 s
Coupled, 2 GPUs: 24.2 s
Coupled, 4 GPUs: 22.2 s

4-core
Simple, no GPU: 22.6 s
Coupled, no GPU: 67.5 s
Coupled, 1 GPU: 48.4 s
Coupled, 2 GPUs: 44.7 s
Coupled, 4 GPUs: 42.1 s

1-core
Simple, no GPU: 89.0 s
Coupled, no GPU: 273.5 s
Coupled, 1 GPU: 132.1 s |
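To make the coupled-solver timings above easier to compare, here is a minimal Python sketch that turns them into speedups relative to the CPU-only run at the same core count. The numbers are copied from the post; the dictionary layout and the function name `gpu_speedups` are only illustrative, and a value below 1 means the GPU run was slower than CPU-only.

```python
# Coupled solver timings in seconds, copied from the post above.
# Outer key: number of CPU cores. Inner key: number of GPUs (0 = CPU-only).
coupled_times = {
    32: {0: 13.2, 1: 28.1, 2: 24.2, 4: 22.2},
    4: {0: 67.5, 1: 48.4, 2: 44.7, 4: 42.1},
    1: {0: 273.5, 1: 132.1},
}


def gpu_speedups(times_by_gpu_count):
    """Return the speedup of each GPU run relative to the CPU-only run."""
    baseline = times_by_gpu_count[0]
    return {
        gpus: baseline / seconds
        for gpus, seconds in times_by_gpu_count.items()
        if gpus > 0
    }


for cores, times in coupled_times.items():
    for gpus, speedup in gpu_speedups(times).items():
        print(f"{cores:>2} cores + {gpus} GPU(s): {speedup:.2f}x vs CPU-only")
```

Read this way, the GPU roughly doubles throughput for a single core, helps moderately with 4 cores, and slows the 32-core run down on this small case.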
|
August 27, 2018, 13:16 |
|
#44 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Great, finally some decent hardware. Would you mind running a larger case (coupled+DP would be enough)? I only chose such a small one due to the lack of VRAM on the GPUs I had available at the time. Would be interesting to see if you can get some GPU scaling going while running 32 CPU cores.
|
|
August 27, 2018, 14:13 |
|
#45 |
Member
Join Date: Dec 2016
Posts: 44
Rep Power: 9 |
||
August 29, 2018, 12:58 |
|
#46 |
Senior Member
Micael
Join Date: Mar 2009
Location: Canada
Posts: 157
Rep Power: 18 |
Flow Setup:
as OP, except the mesh is 215 x 215 x 215 (10M cells)

Software/Hardware:
Operating system: CentOS Linux 7
Fluent version: Ansys Fluent 19.1
CPU: Dual Xeon Gold 6150, HT disabled
Memory: 192 GB DDR4-2666 ECC (12 DIMMs x 16 GB)
GPU: 4 x V100-32GB NVLINK

Did only double precision.

32-core
Coupled, no GPU: 577 s
Coupled, 1 GPU: failed, apparently out of memory
Coupled, 2 GPUs: 541 s
Coupled, 4 GPUs: 394 s |
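For the 10M-cell case, the more interesting question is how well the run scales from 2 to 4 GPUs. The sketch below computes that from the times reported above; `scaling_efficiency` is a hypothetical helper name, and the usual strong-scaling definition (measured speedup divided by ideal speedup) is assumed.

```python
def scaling_efficiency(time_small, gpus_small, time_large, gpus_large):
    """Strong-scaling efficiency when going from gpus_small to gpus_large GPUs.

    1.0 means perfect scaling (time drops in proportion to the GPU count).
    """
    measured_speedup = time_small / time_large
    ideal_speedup = gpus_large / gpus_small
    return measured_speedup / ideal_speedup


# Times in seconds from the post above (32 cores, coupled solver, 10M cells).
cpu_only = 577.0
two_gpus = 541.0
four_gpus = 394.0

print(f"2 GPUs vs CPU-only: {cpu_only / two_gpus:.2f}x")
print(f"4 GPUs vs CPU-only: {cpu_only / four_gpus:.2f}x")
print(f"2 -> 4 GPU scaling efficiency: "
      f"{scaling_efficiency(two_gpus, 2, four_gpus, 4):.0%}")
```

The 1-GPU run is left out because it failed, so there is no single-GPU baseline for this mesh.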
|
April 12, 2019, 19:02 |
I am using a TITAN V and cannot load this GPU with my simulation
|
#47 |
New Member
MIguel Rodriguez
Join Date: Jan 2017
Posts: 1
Rep Power: 0 |
I bought a Titan V. I did not know that this card does not work with Ansys Fluent. Has anyone asked you how to resolve this problem? Is it possible to load this GPU with a simulation? I followed every step for activating the GPU, but it does not work. Please help me.
|
|
April 12, 2019, 20:15 |
|
#48 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Seems like this topic comes around again every now and then...
If the solver you are using is not utilizing the GPU despite it being activated in the Fluent launcher, then you won't see much benefit from the GPU anyway. You could force Fluent to use the GPU with some TUI commands, but again, expect to see no improvement or even worse performance with a GPU enabled in these cases.
https://www.sharcnet.ca/Software/Ans...-EC933A7E.html
Or maybe Ansys decided to use a whitelist for GPUs in Fluent just like they did with some of their other software. It's been a while since I last used it. |
|
June 1, 2020, 17:33 |
|
#49 | |
New Member
sida
Join Date: Dec 2019
Posts: 6
Rep Power: 6 |
Quote:
I'm very eager to see the result of your tests with Tesla V100. Before reading your posts, I was going to combine Threadripper 3970x with Quadro RTX 4000, but now that GPU acceleration is not as effective/justifiable as advertised, what alternative do you suggest for Quadro RTX 4000? |
||
June 1, 2020, 18:59 |
|
#50 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
It should come as no surprise that I never got any samples. I was not really expecting that.
I don't have any alternative in the price range of a Quadro RTX 4000 card. Well, Nvidia doesn't, but that is splitting hairs.
GPU acceleration with Ansys products is for people with a virtually unlimited hardware budget, because software, engineers and development time are so much more expensive than a workstation. My advice to everyone else is to focus on CPU performance first.
If you really want to do GPU acceleration on a budget, try used Quadro K6000 cards. They can be found for around $300-400. That is, of course, if you want to do double precision. With single precision, any semi-recent CUDA-capable card should do. The consumer cards offer much better value than the Quadro and Tesla lineup here. |
|
June 2, 2020, 01:38 |
|
#51 |
New Member
sida
Join Date: Dec 2019
Posts: 6
Rep Power: 6 |
Thanks for the quick response, it was a relief after days of research
Best |
|
June 2, 2020, 02:34 |
|
#52 |
New Member
sida
Join Date: Dec 2019
Posts: 6
Rep Power: 6 |
Also, this article, which uses OpenFOAM, can help us understand that an investment in CPU is much more reliable than an investment in GPGPU, at least when it comes to CFD.
Multi GPU Implementation to Accelerate the CFD Simulation of a 3D Turbo-Machinery Benchmark Using the RapidCFD Library https://link.springer.com/chapter/10...030-38043-4_15 |
|
February 27, 2021, 12:28 |
|
#53 |
New Member
Bhanuday Sharma
Join Date: Jun 2015
Posts: 18
Rep Power: 11 |
It would have been helpful if you had uploaded your .cas / mesh file, so that other users could quickly test their system configuration.
|
|
April 28, 2023, 10:08 |
Summa summarum
|
#54 | |
Member
Join Date: Aug 2012
Location: Italy
Posts: 68
Rep Power: 14 |
Quote:
In case of medium parallelization (max 64/128 cores), GPGPU can be convenient only if: 1) you're using coupled algorithms; AND 2) you're using powerful graphics cards (Quadro 5000 or above).
On top of this, GPU RAM must be big enough to contain the mesh of the problem you're going to study. This means that, if your average mesh requires around 64 GB of RAM (I understand that it can sound quite small for some of you guys, but for those who don't work at NASA it's quite a lot!), and you have planned to adopt Quadro RTX 5000s (16 GB each), you should have a quad SLI or more...
Given all this, it's absolutely impossible for a "normal" user to make use of GPGPU technology in CFD.
Many thanks, C. |
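The card-count arithmetic in the post above can be written down as a short sketch. It assumes, as the post does, that the whole problem has to fit in GPU memory and that the memory of several cards simply adds up; real solvers keep some data on the host and need per-card overhead, so treat the result as a rough lower bound. The function name `gpus_needed` is made up for illustration.

```python
import math


def gpus_needed(problem_memory_gb, vram_per_gpu_gb):
    """Minimum number of identical GPUs whose combined VRAM holds the problem.

    Assumption: memory adds up perfectly across cards, with no overhead.
    """
    return math.ceil(problem_memory_gb / vram_per_gpu_gb)


# Example from the post: a 64 GB case on Quadro RTX 5000 cards (16 GB each).
print(gpus_needed(64, 16))
```

With the figures from the post this gives four cards, matching the "quad" setup mentioned there.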
||
April 28, 2023, 11:27 |
|
#55 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Some things changed since I originally posted my little experiment.
For example, Ansys now has a native GPU solver, which allegedly runs much faster. I cannot comment on that claim.
Some things didn't change though, at least not for the better.
Commercial GPU solvers -native or otherwise- don't have feature parity with the established CPU counterparts. If you have everything you need, or are willing to change your workflow to accommodate the missing features, maybe it is for you.
GPU memory is still a scarce resource. Nvidia now sells cards with 80GB of VRAM (e.g. H100), but of course these are at the high end for data centers.
And noteworthy FP64 performance is reserved for very few products at the high end. Everything else is cut down to a 1:32 divider for FP64.
My main motivation for writing this article in the first place was this: people here regularly inquired "which graphics card should I buy to get good acceleration in my new Fluent workstation? My total budget is -insert figure below 10000€-".
For the vast majority of cases, the answer is: just stick to maximizing CPU performance. GPU acceleration or computation with commercial CFD solvers is for data centers. The hardware is just too expensive to make it work in any other setting. This is a trend that has only accelerated over the last few years.
Or to put it very bluntly: if you have to ask me -a random stranger on the internet- for advice, you probably should not bother with GPUs.
Of course, if you like to tinker with used hardware, don't let me stop you. P100s go for less than 300€ on eBay these days. |
|
May 11, 2023, 09:31 |
|
#56 | |
Senior Member
Arjun
Join Date: Mar 2009
Location: Nurenberg, Germany
Posts: 1,290
Rep Power: 34 |
Quote:
I understand this perception comes from presentations by Ansys and Siemens, where they show results from top-of-the-line GPUs that a normal person would not have on their desktop. So yes, if users confine themselves to these few names, then what you said is very true.
But with Wildkatze I try to focus on what a layman could have on the desktop and what we can gain from it. Even with my old 2080 Ti I am able to get a speedup of almost 25x to 30x. If I were to pick a current GPU like the 40XX series, this would go a long way, and a normal user can afford that.
The only problem I can see is that people still won't use the solver because they do not know the name (people use what they know and don't want to try anything new). But if they want to, they can get a good speedup from the GPU here. |
||
May 11, 2023, 12:21 |
|
#57 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
I find it very commendable that you put in the effort to make it work with hardware we can actually get our hands on.
But a 25x speedup from a 2080 Ti compared to a CPU begs the question: what CPU are you comparing to? And are we talking multi-threaded or single-threaded?
Please don't take this the wrong way, but such outrageously high speedups from GPU acceleration, when comparing to a reasonably modern CPU, usually get you a few raised eyebrows in the HPC community, because they usually mean that the CPU implementation simply does not have the same level of optimization as the GPU implementation. When looking at the raw specs like theoretical FP32 operations per second, or memory bandwidth, there is not a 25x gap between CPUs and GPUs. At least leaving aside hardware accelerated operations. |
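The raw-specs argument above can be illustrated with a back-of-the-envelope estimate. The sketch assumes the solver is limited by memory bandwidth, which is a common but not universal assumption for sparse linear algebra in CFD. The GPU figure is the published peak bandwidth of the RTX 2080 Ti; the CPU figure is a hypothetical placeholder for a multi-channel workstation CPU, not a measurement of any machine in this thread.

```python
def bandwidth_bound_speedup(gpu_bandwidth_gbs, cpu_bandwidth_gbs):
    """Upper bound on GPU-vs-CPU speedup if both codes are purely
    memory-bandwidth bound and equally well optimized (an assumption)."""
    return gpu_bandwidth_gbs / cpu_bandwidth_gbs


GPU_BANDWIDTH_GBS = 616.0  # RTX 2080 Ti, published peak memory bandwidth
CPU_BANDWIDTH_GBS = 100.0  # hypothetical placeholder for a workstation CPU

bound = bandwidth_bound_speedup(GPU_BANDWIDTH_GBS, CPU_BANDWIDTH_GBS)
print(f"Bandwidth-bound speedup estimate: {bound:.1f}x")
```

Under these assumptions the estimate lands in the single digits, which is why a reported 25x invites the question of how well the CPU baseline was optimized.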
|
May 11, 2023, 12:28 |
RTX A6000
|
#58 |
New Member
Join Date: Jun 2018
Posts: 7
Rep Power: 8 |
I have a barely used RTX A6000 for sale at 4500.
What you need is VRAM size, so the CAD mesh doesn't have to be continuously broken up and sent back and forth from the SSD to the CPU to the GPU. |
|
May 11, 2023, 13:29 |
|
#59 | |
Senior Member
Arjun
Join Date: Mar 2009
Location: Nurenberg, Germany
Posts: 1,290
Rep Power: 34 |
Quote:
I was very casual, so I did not think of mentioning the CPU etc. The CPU here is an AMD 2990WX with 32 cores, which I am sure you know. The machine has 128 GB RAM, but the whole thing was run on the GPU in double precision.
The idea I am working on is to make a GPU engine that lets people run cases from different solvers. At the moment it runs cases from the two pieces of software I have. If someone helps me with an OpenFOAM loader (a translator), it should be able to run those cases too. That's the idea so far.
Edited to add: So far in our testing we are within 5% of STAR-CCM+ in cost per iteration. This should give some idea about the implementation (we usually take fewer iterations to converge compared to STAR-CCM+; with Fluent we never could compare). |
||
May 11, 2023, 15:20 |
|
#60 |
Senior Member
Joern Beilke
Join Date: Mar 2009
Location: Dresden
Posts: 539
Rep Power: 20 |
I'm not sure about the right terminology in GPU computing, but Arjun's implementation comes without domain decomposition, so there is just one single processor domain. This seems to be a good way to accelerate cases that have a limited potential for a parallel speedup.
I have seen the MotorBike benchmark running on his machine with a preliminary version of the code and it was more than impressive. |