November 13, 2010, 15:02 |
OpenFOAM goes multi-gpu
|
#1 |
Member
|
Dear All,
We are about to release a new version of the OpenFOAM plugin, aimed at accelerating OF simulations on multi-GPU systems (it also works with a single GPU). I was wondering whether OpenFOAM users have any special feature requests. FYI, the plugin has the following functionality:
- It is a plugin and does not require recompiling OF. You just change one line in the configuration file and GPU-based solvers are used, if available.
- Currently, we also provide CG in single precision at no cost to demonstrate its usefulness, but the plugin works with any GPU-based solver that handles matrices in CSR format.
- Note: We charge for CG/BiCGSTAB in double precision and for support. Installation of the free OF plugin will be quite simple.
- It will discover how many GPU cards the system has and will submit more intensive jobs to better cards.
- The plugin will be available at speedit.vratis.com
- So far we have tested it with our solvers and observed acceleration ranging from several times on a standard setup (GTX285+GTX460) to x38 on 4xTesla machines, depending on the problem.
At speedit.vratis.com there is a forum where we have opened the discussion on feature requests. I am looking forward to meeting you there. |
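For readers unfamiliar with the CSR (compressed sparse row) format mentioned in this post, here is a minimal Python sketch of the layout and of the sparse matrix-vector product that CG-type solvers call in every iteration. It is purely illustrative: the function and variable names are made up for this example and are not the plugin's API.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, stored row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i owns entries row_ptr[i] .. row_ptr[i + 1] - 1
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


# Illustrative 3x3 matrix:
# [ 4 -1  0]
# [-1  4 -1]
# [ 0 -1  4]
values = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]

y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
print(y)  # [2.0, 4.0, 10.0]
```

On a GPU the outer loop over rows is what gets parallelized, which is why the cost of this kernel dominates the solver time.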
|
November 13, 2010, 21:21 |
|
#2 |
Senior Member
Travis Carrigan
Join Date: Jul 2010
Location: Arlington, TX
Posts: 161
Rep Power: 16 |
Excellent!
|
|
November 13, 2010, 23:45 |
|
#3 |
Member
Masashi Ohbuchi
Join Date: Oct 2009
Posts: 74
Rep Power: 17 |
Wonderful!
Is it SpeedIT eXtreme 1.2? When will you release it? |
|
November 14, 2010, 23:55 |
a few questions
|
#4 | |
Senior Member
Daniel P. Combest
Join Date: Mar 2009
Location: St. Louis, USA
Posts: 621
Rep Power: 0 |
1. When you say 38x speedup, are we talking about the linear system solvers or the whole solver (say simpleFoam)? If we are talking about the whole solver (simpleFoam again), then the 38x speedup sounds a little high, since there are lots of other processes besides the linear system solver that take a lot of time to complete. If we are talking about the linear system solver itself, then that is a little more believable. Also, how does the code compare to more CPUs? What CPUs are we comparing to: P4s or quad-core Xeons?
2. When you compare the speedup, are you comparing single precision GPU to single precision OF, or single precision GPU to double precision OF?
3. Have you ensured that you are comparing convergence the same way that OF does, with a scaled residual? If not, then your speedup calculations are not entirely accurate.
4. Parallel: Are you letting OF do the domain decomposition and then passing the decomposed domains to the GPUs, which communicate with each other, or are you only solving the Ax=b system on several GPUs through a parallel Krylov subspace CG (or similar) solver? If it is the latter, then I would request that the code use the decomposed domains from decomposePar and work that way. This method should be faster on really large problems, since there are other processes that can be done by the separate CPUs while the linear system solvers run on the GPUs. I guess I'm asking for parallel CPU and parallel GPU together.
5. What kind of preconditioners are available? Multigrid would be excellent, or even some sort of sparse approximate inverse (without fill-in, to prevent forming a dense matrix).
6. Wouldn't speedup depend tremendously on the host-to-device memory bandwidth (main memory on the motherboard to the GPU device memory)? If so, then what type of motherboard is being used for the comparison, and does anyone know if the results can be scaled in a way that they can be compared to other setups?
7. Lastly, since the CUDA code is provided with the plugin, couldn't one just change a float to a double in the proper place?
Again, great work! I like this type of work and just had a few questions.
Dan |
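The concern in question 1 can be made concrete with Amdahl's law: if the linear solvers account for a fraction f of the total runtime and are accelerated s times, the whole application speeds up by 1 / ((1 - f) + f / s). A small Python sketch follows; the fraction used in the example is an assumption chosen for illustration, not a measurement from this thread.

```python
def overall_speedup(solver_fraction, solver_speedup):
    """Amdahl's law: whole-application speedup when only the linear
    solvers (a fraction of the runtime) are accelerated."""
    if not 0.0 <= solver_fraction <= 1.0:
        raise ValueError("solver_fraction must lie in [0, 1]")
    if solver_speedup <= 0.0:
        raise ValueError("solver_speedup must be positive")
    return 1.0 / ((1.0 - solver_fraction) + solver_fraction / solver_speedup)


# Hypothetical case: the linear solvers take 80% of the runtime
# and run 38 times faster on the GPU.
print(round(overall_speedup(0.8, 38.0), 2))  # about 4.52
```

Under that assumption the whole solver gains about 4.5x, not 38x, which is why it matters whether a quoted speedup refers to the linear solver alone or to the complete application.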
||
November 15, 2010, 04:48 |
|
#5 |
Member
|
That would be OF plugin 1.1. We should release it this week.
SpeedIT eXtreme 1.2 will probably be released this year with: - support for complex operations - maybe also new SpMV kernels (that are faster than those in cuSPARSE). |
|
November 15, 2010, 05:12 |
|
#6 |
Member
|
Dear Dan. Let me answer your questions:
Quote: 1. When you say 38x speedup are we talking about the linear system solvers or the whole solver (say simpleFoam)? If we are talking about the whole solver (simpleFoam again) then the 38X speedup sounds a little high since there are lots of other processes besides the linear system solver that takes a lot of time to complete. If we are talking about the linear system solver itself then that is a little more believable. Also...how does the code compare to more CPUs? What CPUs are we comparing to...P4s or Quad Core Xeons?
It was a large simpleFoam job. For 1 GPU vs. 4 GPUs the speedup is x6. For 4 GPUs vs. 12 CPUs (Opteron 2.2 GHz), using diagonal preconditioners in both cases, the speedup is x38.
Quote: 2. When you compare the speedup are you comparing single precision GPU to single precision OF or single precision GPU to double precision OF?
Single precision vs. single precision, and double precision vs. double precision.
Quote: 3. Have you ensured that you are comparing convergence the same way that OF is computing with a scaled residual? If not then your speedup calculations are not entirely accurate.
Yes. We also checked the residuals. Send me a private message and I can send you the logs from the computations.
Quote: 4. Parallel: Are you letting OF do the domain decomposition and then passing the decomposed domains to the GPU and communicate between each other or are you only solving the AX=b system on several GPUs through a parallel Krylov subspace CG (or alike) solver?
Exactly. We take advantage of OF domain decomposition.
Quote: If this is the later, then I would request that the code use the decomposed domains from decomposePar and work that way. This method should be faster on really large problems since there are other processes that can be done by the separate CPUs and the linear system solvers on the GPUs. I guess Im asking for parallel CPU and parallel GPU together.
So far we have replaced solvers with their GPU versions for simpleFoam and pisoFoam cases and observed acceleration. We did not use parallel CPU and parallel GPU together.
Quote: 5. What kind of preconditioners are available? Multigrid would be excellent, or even some sort of sparse approximate inverse (without fill in to prevent forming a dense matrix).
We are working on GAMG, but I cannot yet estimate when we will finish.
Quote: 6. Wouldn't speedup depend tremendously on the memory bandwidth of host to device (main memory on the mother board to the GPU device memory)? If so then what type of mother board is being used for the comparison and does anyone know if the results can be scaled in a way that they can be compared to other setups?
We noticed that Intel i7 processors provide higher memory bandwidth. Older types perform worse. Of course, if the number of iterations is low, then our library is of no use because of the memory transfers. This is why we are working on porting the whole PISO solver to the GPU.
Quote: 7. Lastly, since the CUDA code is provided with the plugin, couldn't one just change a float to a double in the proper place?
There are double precision kernels, but only in the eXtreme version of SpeedIT. The Classic version supports only float.
Quote: Again...great work! I like this type of work and just had a few questions.
Thanks.
Best wishes, Lukasz |
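The remark in answer 6 about low iteration counts can be illustrated with a simple break-even model. The Python sketch below assumes the matrix is copied to the GPU once per solve and that every iteration then costs a fixed time; both the model and the timings in the example are simplifying assumptions, not measurements from SpeedIT.

```python
import math


def break_even_iterations(transfer_time, cpu_iter_time, gpu_iter_time):
    """Smallest iteration count n for which
    transfer_time + n * gpu_iter_time < n * cpu_iter_time.

    Returns None when a GPU iteration is not faster than a CPU iteration,
    because the transfer cost can then never be recovered.
    """
    gain_per_iteration = cpu_iter_time - gpu_iter_time
    if gain_per_iteration <= 0.0:
        return None
    return math.floor(transfer_time / gain_per_iteration) + 1


# Hypothetical timings in seconds: 1.0 to copy the system to the device,
# 0.5 per CPU iteration, 0.25 per GPU iteration.
print(break_even_iterations(1.0, 0.5, 0.25))  # 5
```

With these numbers a solve that converges in four iterations or fewer is no faster on the GPU. Keeping the whole algorithm on the device removes the per-solve transfer term from the model.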
|
November 15, 2010, 16:48 |
great...I'll check it out as soon as it is available.
|
#7 |
Senior Member
Daniel P. Combest
Join Date: Mar 2009
Location: St. Louis, USA
Posts: 621
Rep Power: 0 |
great...I'll check it out as soon as it is available. Thanks for answering my questions.
Dan |
|
November 15, 2010, 17:13 |
|
#8 |
New Member
Zhijun
Join Date: Nov 2010
Posts: 3
Rep Power: 16 |
Dear Lukasz, this looks too good to be true. Please allow one question about licences. You list the Classic version correctly as GPL, but the academic and commercial versions cannot be used as OpenFOAM plugins. As you know, OpenFOAM is GPL. A plugin interface must be derived from OpenFOAM header files. For this reason every plugin is also GPL. Please explain your license model. We would like to purchase at least the evaluation version, but if the legal foundation is unclear, we have to order CUDA solvers elsewhere.
Thanks, Zhijun |
|
November 15, 2010, 23:38 |
|
#9 |
Member
|
Dear Zhijun, you are of course correct. All work derived from OF is GPL, meaning the OF plugin and SpeedIT Classic are GPL-based. SpeedIT eXtreme is a separate development and has no dependencies on, bindings to, or relations with OF, except that it supports the CSR format, which is in the public domain. You can use it in your own code.
Let me know if I answered your question. |
|
November 16, 2010, 22:37 |
|
#10 | |
Senior Member
Martin Beaudoin
Join Date: Mar 2009
Posts: 332
Rep Power: 22 |
So basically, this means that the only product we can use with OpenFOAM is your product called SpeedIT Classic, which is a limited, single precision, but GPL-licensed package; is that right?
Martin
|
||
November 17, 2010, 15:08 |
|
#11 |
Member
|
It is like using Matlab with OF. Matlab supports CSR as well. Anyway, thank you for the fruitful discussion. We have decided to put a special explanation on our web page for OF users that explains the licensing terms in more detail.
|
|
November 17, 2010, 22:13 |
|
#12 |
Senior Member
Martin Beaudoin
Join Date: Mar 2009
Posts: 332
Rep Power: 22 |
Sorry, still cannot find that information on your Web site.
Why don't you put that information plainly on this OpenFOAM Message Board? Martin |
|
November 18, 2010, 12:12 |
OF Version
|
#13 |
New Member
Josiah Xu
Join Date: Jan 2010
Posts: 8
Rep Power: 16 |
Does the OpenFOAM plugin work only with one fixed OF version, or with most versions? I am using OF 1.5-dev and 1.6.
|
|
November 18, 2010, 15:18 |
|
#14 |
Member
|
Dear Josiah Xu. It has been tested on OF 1.6 and 1.7.
|
|
November 19, 2010, 14:53 |
|
#15 |
Member
|
Dear All,
We are happy to announce a new release of the OpenFOAM plugin, version 1.1 (GPL license). Here is the list of features:
- Multi-GPU support.
- Tested on the Fermi architecture (GTX460 and Tesla C2050).
- Automated submission of the domain to the GPU cards (using decomposePar from OpenFOAM).
- Optimized submission of computational tasks to the best GPU card in the system for any number of computational threads.
- The plugin picks the most powerful GPU card for single-thread cases.
You can freely download it at speedit.vratis.com. Enjoy! |
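The announcement does not say how tasks are mapped to cards. One plausible heuristic, shown here only as a hypothetical Python sketch, is to rank the GPUs by some capability score and hand out the decomposed subdomains round-robin, strongest card first. The scores and the function name are assumptions made for this example and do not describe the plugin's actual policy.

```python
def assign_ranks_to_gpus(gpu_scores, n_ranks):
    """Map each parallel rank to a GPU index.

    gpu_scores : one capability score per GPU (higher means stronger)
    n_ranks    : number of subdomains / computational threads
    Ranks are dealt out round-robin over the GPUs sorted by score,
    so a single rank always lands on the strongest card.
    """
    if not gpu_scores:
        raise ValueError("no GPUs available")
    order = sorted(range(len(gpu_scores)),
                   key=lambda g: gpu_scores[g], reverse=True)
    return [order[rank % len(order)] for rank in range(n_ranks)]


# Hypothetical scores for a two-card machine (GPU 0 weaker, GPU 1 stronger).
scores = [240, 336]
print(assign_ranks_to_gpus(scores, 1))  # [1]
print(assign_ranks_to_gpus(scores, 3))  # [1, 0, 1]
```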
|
November 26, 2010, 22:13 |
|
#16 |
Member
|
For those who asked me about it: complex matrices will soon be supported as well.
Best, Lukasz |
|
December 8, 2010, 15:15 |
|
#17 |
Member
|
BTW, is anybody aware of OpenCL-based solvers?
|
|
February 25, 2011, 18:26 |
|
#18 |
Member
Join Date: Nov 2009
Posts: 48
Rep Power: 17 |
||
February 27, 2011, 09:41 |
|
#19 |
New Member
Jakub Pola
Join Date: Feb 2011
Posts: 22
Rep Power: 15 |
Hello,
To see if you can use the SpeedIT plugin for interFoam or interDyMFoam, please check the system/fvSolution file and see if the case is using CG or BICG solvers. If yes, you can substitute them with CG_accel and BiCG_accel respectively. Remember that in the SpeedIT Classic version only CG_accel is available. I hope that helps. Best regards. Kuba. |
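The check described in this post can be scripted. The Python sketch below lists the `solver` entry of each field block in fvSolution-style text and looks up the substitution named in the post. Only the two substitutions from the post are encoded; the regular expression handles simple, non-nested field blocks only, and the sample dictionary is a made-up illustration rather than a file from a real case.

```python
import re

# Substitutions exactly as stated in the post; other names are not covered.
ACCEL_NAMES = {"CG": "CG_accel", "BICG": "BiCG_accel"}

# A field name, an opening brace, then a `solver <name>;` entry
# before any nested brace.
FIELD_SOLVER = re.compile(r"\b(\w+)\s*\{[^{}]*?\bsolver\s+(\w+)\s*;")


def list_solvers(fv_solution_text):
    """Return (field, solver) pairs found in fvSolution-style text."""
    return FIELD_SOLVER.findall(fv_solution_text)


def suggest(solver_name):
    """Return the accelerated name given in the post, or None if the
    post does not mention this solver."""
    return ACCEL_NAMES.get(solver_name.upper())


SAMPLE = """
solvers
{
    p_rgh
    {
        solver          PCG;
        preconditioner  DIC;
        tolerance       1e-07;
    }
    U
    {
        solver          PBiCG;
        preconditioner  DILU;
        tolerance       1e-06;
    }
}
"""

for field, solver in list_solvers(SAMPLE):
    print(field, solver, suggest(solver))
```

A result of `None` means only that the post does not name a substitute for that solver, so compatibility would have to be confirmed with the plugin authors.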
|
March 12, 2011, 12:25 |
|
#20 |
Member
Join Date: Nov 2009
Posts: 48
Rep Power: 17 |
Hello Kuba,
Thanks for your answer. I checked my fvSolution. The solver for pcorr and p_rgh is PCG, and for U it is PBiCG. I guess it is not compatible with this plugin. Am I right? Mehran |
|
Tags |
cuda, gpu, openfoam |
|
|