|
May 13, 2014, 15:05 |
Single Core vs. Multi Core Issue
|
#1 |
Member
Join Date: Mar 2014
Posts: 39
Rep Power: 12 |
Hey cfd-online community,
I have an issue regarding the difference in computational time between single-core and multi-core simulations (4 cores). In general, multi-core processing of course needs less time per iteration than a single-core run. But now I'm examining a case where my computer nearly runs out of memory - RAM usage is close to 100% - and the computational time is determined by the time spent reading, writing and transferring data rather than by the number of CPU cores (CPU usage is very low). I noticed that in this case the time per iteration on a single core is about half of that on multiple cores. Do you have an idea why this happens, or can someone explain it? Regards, Traction |
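A quick way to confirm the swapping diagnosis on Linux is to watch `/proc/meminfo` during a run (a sketch; assumes a standard Linux procfs, nothing FLUENT-specific):

```shell
# Sample memory and swap state while the solver iterates. If SwapFree
# keeps dropping while MemAvailable sits near zero, the run is paging
# to disk and iteration time is I/O-bound rather than CPU-bound.
grep -E 'MemTotal|MemAvailable|SwapTotal|SwapFree' /proc/meminfo
```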
|
May 14, 2014, 14:32 |
|
#2 |
Member
vlg
Join Date: Jul 2011
Location: My home :)
Posts: 81
Rep Power: 18 |
I don't fully understand your situation - is it a personal computer or a cluster? If a personal computer, only points 1 and 4 apply; 4 is the most probable in your case.
1) If you often read/write different case/data files (rather than continuing one calculation), the problem is that the solution is partitioned across the cores (a time expense) and then gathered back from the parts (another time expense). Mesh partitioning is the kind of operation where one core (the host, main, head or master process) divides the work among the others with algorithmic load balancing. MPI is then used to send these parts through the network interface to the other cores, and afterwards the main process collects the results with some kind of MPI receive (gathering). Solution: small meshes don't need partitioning/parallelization by domain decomposition (the method widely used in mesh-based solvers).
2) The other issue could be the interconnect throughput/latency. Relatively low speed plus the large network traffic generated by FLUENT (small pieces of work on each core, so iterations finish very quickly) leads to bad performance; you can even get worse performance than on a single core. Solution: choose a proper interconnect. InfiniBand is supported by FLUENT and is very fast - use it instead of ethernet, if you have it. Code:
-pinfiniband
See also the solution for 1) (on a personal computer, only that solution applies in 2).
3) (for clusters only) The third thing to mention is the speed of your data storage system. A slow storage system plus frequent disk reads/writes means bad performance. Solution: use a good data storage system.
4) If you are out of RAM, then part of your calculation proceeds in swap, that is, on hard disk space. When you use a single core, a single data stream is written to the hdd; when you use four, four data streams are written simultaneously. But your hdd can't write/read 4 streams simultaneously (assuming you don't have a parallel r/w storage system): the drive heads seek back and forth writing/reading the pieces of data. So you can't exceed the hdd speed in that case, and with the partitioning overhead from 1) on top, every core you add only slows your solution down further. Someone correct me if I'm wrong somewhere. Solution: increase RAM, or use distributed-memory systems (clusters, supercomputers). |
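As a rough illustration of point 4, here is a back-of-envelope sketch of how partitioning interacts with available RAM. The per-cell memory figure and the halo (partition-boundary duplication) overhead below are assumed example numbers, not FLUENT specifics - measure your own case to calibrate them:

```python
# Rough check: does a partitioned case still fit in RAM, or will it swap?
# kb_per_cell and halo_overhead are ASSUMED illustrative values.

def fits_in_ram(n_cells, n_partitions, ram_gb,
                kb_per_cell=2.0, halo_overhead=0.05):
    """Return (estimated_gb, fits) for a mesh split across n_partitions.

    Each extra partition duplicates a layer of 'halo' cells at the
    partition boundaries, so total memory grows slightly with core count.
    """
    base_gb = n_cells * kb_per_cell / (1024 ** 2)
    total_gb = base_gb * (1 + halo_overhead * (n_partitions - 1))
    return total_gb, total_gb < ram_gb

# Example: 10M cells on 4 cores with 16 GB RAM, under the assumed figures.
total, ok = fits_in_ram(10_000_000, 4, 16)
print(f"estimated {total:.1f} GB, fits: {ok}")
# → estimated 21.9 GB, fits: False
```

The point of the estimate is that adding partitions never reduces the total memory footprint - it grows slightly - so once a case spills into swap, more cores only add more competing disk streams.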
|
May 14, 2014, 14:52 |
|
#3 |
Member
Join Date: Mar 2014
Posts: 39
Rep Power: 12 |
I think point 4 is the main problem in my calculation.
With the help of your explanation I'm starting to understand FLUENT and its connection to the required hardware much better. Thank you very much! |
|