|
October 21, 2015, 17:11 |
Foam-extend 3.1 cuda solver solve time
|
#1 |
New Member
Paul Handy
Join Date: Sep 2014
Location: Idaho, USA
Posts: 21
Rep Power: 12 |
I've just compiled the cuda solver for foam-extend 3.1. I'm using a Quadro K4000. After much effort, I was able to successfully run a few cases with the cuda solver, but it has been an order of magnitude slower than a single-core CPU run. Perhaps it is only because this setup requires more relaxation iterations, but I've not been able to get anything faster out of it. I've attached the simple icoFoam cavity case that I modified to run on the GPU, in hopes that someone can help me figure out how to get some use out of the cuda solver.
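For anyone trying the same thing, switching the cavity case over to the GPU comes down to the fvSolution dictionary. A minimal sketch of what mine looks like — the solver and preconditioner names here assume the foam-extend cudaSolvers library and may differ in your install, so check the solver names your build actually registers:

```
solvers
{
    p
    {
        solver          cudaCG;        // GPU conjugate gradient (assumed name, symmetric matrix)
        preconditioner  diagonal;      // simplest option; others may be available
        tolerance       1e-06;
        relTol          0;
    }

    U
    {
        solver          cudaBiCGStab;  // GPU BiCGStab (assumed name, asymmetric matrix)
        preconditioner  diagonal;
        tolerance       1e-05;
        relTol          0;
    }
}
```

Everything else in the case (blockMeshDict, controlDict, fvSchemes) is the stock icoFoam cavity tutorial.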
Cheers. |
|
October 21, 2015, 21:14 |
|
#2 |
Senior Member
Daniel P. Combest
Join Date: Mar 2009
Location: St. Louis, USA
Posts: 621
Rep Power: 0 |
Paul,
If you are really running this case as attached, the mesh is very small....too small to see any benefit. If you bump the cell count way up, then you will start to see some action. I would also play with the preconditioners once you have a sizable mesh. Lastly, at high cell counts, transient cases tend not to be a good choice for GPUs with this method, because the run will spend a lot of time moving data around. |
|
October 22, 2015, 10:10 |
|
#3 |
New Member
Paul Handy
Join Date: Sep 2014
Location: Idaho, USA
Posts: 21
Rep Power: 12 |
Thanks for the reply, Dan. I have some cases that I want to run in the future which have much higher cell-counts than the example provided. So GPU solving doesn't work as well on transient cases as it does on steady-state? That's news to me.
|
|
October 22, 2015, 10:31 |
|
#4 |
Senior Member
Daniel P. Combest
Join Date: Mar 2009
Location: St. Louis, USA
Posts: 621
Rep Power: 0 |
It's not that they don't work well; it's the nature of the acceleration. The GPU only solves the inner iterations, i.e. the Ax=b system. FOAM builds the coefficient matrix, along with x and b, and then throws that over to the GPU. If this is a rather large amount of data, then the transfer becomes the bottleneck of the operation. This process of building Ax=b, moving it to the GPU, solving, passing x back....rinse and repeat....can be done efficiently, but you need to minimize the bottlenecks. So, if you are doing many inner iterations, it is a great tool. If you are doing a few inner iterations and then taking many outer iterations (time steps or PIMPLE/PISO/SIMPLE solver iterations), then it will be slow. Ideally, this is great for problems where very deep convergence of the inner iterations is a must.

Now, there is an effort to offload the entire PIMPLE/PISO/SIMPLE algorithm onto the GPU to reduce this bottleneck, and it has shown some promise. But at present, the hybrid computing approach currently in use with FOAM has its issues. If you are a developer....I'm happy to revive it and do some more development work. I haven't touched this since grad school (the cufflink project), and another user moved it into foam-extend.....but we can definitely revive it. I just moved it to GitHub about 5 minutes ago. |
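To make the trade-off concrete, here is a back-of-the-envelope cost model of the build/transfer/solve loop described above. The numbers are purely illustrative assumptions, not measurements from any card or case — the point is only the shape of the comparison: a fixed per-outer-iteration transfer cost versus cheaper per-inner-iteration work on the GPU.

```python
# Rough cost model for hybrid CPU/GPU linear solves.
# Assumption: every outer iteration (time step / SIMPLE loop) pays a fixed
# host<->device transfer cost, while each inner (Krylov) iteration is much
# cheaper on the GPU than on the CPU. All numbers are made up for illustration.

def gpu_time(outer_iters, inner_iters, transfer_cost, gpu_iter_cost):
    """Total time when A, x, b are shipped to the GPU every outer iteration."""
    return outer_iters * (transfer_cost + inner_iters * gpu_iter_cost)

def cpu_time(outer_iters, inner_iters, cpu_iter_cost):
    """Total time when everything stays on the CPU (no transfer cost)."""
    return outer_iters * inner_iters * cpu_iter_cost

# Deep inner convergence (few outer, many inner): transfer cost is amortized.
deep_gpu = gpu_time(outer_iters=10, inner_iters=1000,
                    transfer_cost=50.0, gpu_iter_cost=0.1)   # 10 * 150 = 1500
deep_cpu = cpu_time(outer_iters=10, inner_iters=1000,
                    cpu_iter_cost=1.0)                       # 10000

# Transient-style run (many outer, few inner): transfer dominates.
shallow_gpu = gpu_time(outer_iters=1000, inner_iters=10,
                       transfer_cost=50.0, gpu_iter_cost=0.1)  # 1000 * 51 = 51000
shallow_cpu = cpu_time(outer_iters=1000, inner_iters=10,
                       cpu_iter_cost=1.0)                      # 10000

print(deep_gpu < deep_cpu)        # GPU wins with deep inner convergence
print(shallow_gpu > shallow_cpu)  # GPU loses when outer iterations dominate
```

Same total number of inner iterations in both scenarios; only how they are split between inner and outer work changes the winner, which is exactly why small transient cases like the cavity tutorial look so slow on the GPU.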
Tags |
cuda, solve time |
|
|