|
October 14, 2013, 10:55 |
Optimal memory configuration
|
#1 |
Member
Kim Bindesbøll Andersen
Join Date: Oct 2010
Location: Aalborg, Denmark
Posts: 39
Rep Power: 16 |
It is my impression that many CFD hardware setups are memory-bandwidth limited. Finding the highest-performing memory configuration for each workstation or cluster node in question therefore ought to be an important task.
I am in the process of specifying a 32-core mini-cluster based on the Xeon E5-2667 v2, which has 4 memory channels running at up to 1866 MHz. The article below names the DIMM type (UDIMM, RDIMM or LRDIMM) and the number of ranks per DIMM as important parameters: http://www.robotics.net/2013/08/20/o...ive-workloads/

I understand that it is vital to populate all 4 memory channels of each CPU, since 2 channels give only about half the transfer rate. But the importance of rank count and DIMM type is new to me. The article basically gives the following guidelines (a bandwidth test like the sketch after this post would be one way to check them on real hardware):
1) The more ranks per DIMM, the more memory can be accessed in parallel.
2) Quad-rank DIMMs should be avoided, as they cannot run at full speed; dual-rank DIMMs therefore give the highest performance.
3) The bus supports up to 8 ranks per channel.
4) The bus supports up to 3 DIMMs per channel, but only 2 DIMMs per channel can run at full speed.
5) Use UDIMMs, as they are faster than RDIMMs and LRDIMMs.
Conclusion: use 2 dual-rank DIMMs per memory channel, i.e. 8 DIMMs per CPU.

Does anyone have CFD performance data comparing memory configurations with different rank counts and DIMM types? It would be great to know the impact of these parameters on performance.

Best regards
Kim Bindesbøll |
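For measuring this directly, the standard approach is a STREAM-type benchmark. Below is a minimal triad-style sketch in C (my own generic code, not the official STREAM benchmark and not tied to any particular solver); the array size is an arbitrary choice that just needs to dwarf the caches:

Code:
/* Minimal triad-style memory bandwidth sketch (not the official
 * STREAM benchmark).  Build: gcc -O2 -fopenmp triad.c -o triad   */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N      (80UL * 1000 * 1000)  /* 3 arrays ~ 1.9 GB total */
#define NTIMES 10

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    if (!a || !b || !c) return 1;

    /* First-touch init in parallel so pages spread across NUMA nodes */
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < N; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

    double best = 1e30;
    for (int k = 0; k < NTIMES; k++) {
        double t = omp_get_wtime();
        #pragma omp parallel for schedule(static)
        for (size_t i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];       /* triad: 2 loads, 1 store */
        t = omp_get_wtime() - t;
        if (t < best) best = t;
    }
    printf("Triad: %.1f GB/s\n", 3.0 * N * sizeof(double) / best / 1e9);

    free(a); free(b); free(c);
    return 0;
}

For reference, 4 channels of DDR3-1866 have a theoretical peak of 4 x 1866 MT/s x 8 B ≈ 59.7 GB/s per socket; a good triad result typically lands around 70-85% of peak, so running this once per candidate DIMM configuration would answer the rank/DPC question empirically.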
|
October 16, 2013, 15:52 |
|
#2 |
New Member
Lee
Join Date: Jun 2012
Posts: 4
Rep Power: 14 |
Good question; I would be interested in the answer as well.
Currently I am running a cluster of 6-core i7s with 4x8 GB of RAM each. Perhaps it would be better to run 6x6 GB. I haven't thought about it much, but input from anyone with experience is appreciated. -Ross |
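On Ross's 4x8 GB versus 6x6 GB question: assuming these are quad-channel LGA2011 i7s (Ross doesn't name the model, so that is an assumption), 4 DIMMs populate each channel exactly once and stay balanced, whereas 6 DIMMs force a 2+2+1+1 layout that breaks interleaving symmetry, and on some boards a second DIMM in a channel also drops the supported memory speed. Peak bandwidth scales with populated channels, not DIMM count; a trivial sketch of the arithmetic:

Code:
/* Back-of-envelope peak bandwidth: channels x MT/s x 8 bytes/transfer.
 * DDR3-1600 and a quad-channel i7 are assumptions, not Ross's specs. */
#include <stdio.h>

int main(void)
{
    double transfers = 1600e6;            /* DDR3-1600: 1600 MT/s      */
    double per_channel = transfers * 8;   /* 64-bit bus = 8 B/transfer */
    for (int ch = 1; ch <= 4; ch++)
        printf("%d channel(s): %4.1f GB/s peak\n",
               ch, ch * per_channel / 1e9);
    return 0;
}

So for a bandwidth-bound solver, 4x8 GB at 1 DIMM per channel is the safer bet: both layouts touch all four channels, but only the balanced one keeps the full 51.2 GB/s peak cleanly available.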
|
October 17, 2013, 11:08 |
|
#3 |
Member
Kim Bindesbøll Andersen
Join Date: Oct 2010
Location: Aalborg, Denmark
Posts: 39
Rep Power: 16 |
I found this very valuable document, which answers all my questions:
http://i.dell.com/sites/doccontent/s...ance-guide.pdf

Conclusions:
1) Populate all memory channels. Memory bandwidth is almost proportional to the number of populated channels (Figure 31).
2) UDIMMs have only 2% lower latency than RDIMMs (Figure 10).
3) UDIMMs and RDIMMs have equal effective bandwidth (Figure 11).
4) For RDIMMs, latency is the same whether there is 1 or 2 DIMMs per channel (DPC) (Figures 36 and 37).
5) Single-rank RDIMMs have 11% less bandwidth than dual-rank (at 1 DPC) (Figure 37).

Bottom line: use dual-rank UDIMMs or RDIMMs, 1-2 DIMMs in each memory channel.

Best regards
Kim Bindesbøll |
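One practical caveat on point 1 for dual-socket nodes: populating all channels of both CPUs only pays off if the solver's arrays actually end up spread over both NUMA nodes. With OpenMP the usual trick is first-touch initialization: initialize data with the same parallel loop structure the compute kernels use. A generic sketch (my illustration, not something from the Dell guide):

Code:
/* First-touch NUMA placement sketch.  Pages land on the NUMA node of
 * the thread that first writes them, so the init loop should use the
 * same schedule as the compute loops.  gcc -O2 -fopenmp first_touch.c */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N (50UL * 1000 * 1000)

int main(void)
{
    double *x = malloc(N * sizeof *x);
    double *y = malloc(N * sizeof *y);
    if (!x || !y) return 1;

    /* Parallel first-touch init: spreads pages over both sockets, so
     * all 8 channels of a 2-socket node carry traffic later on.      */
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < N; i++) { x[i] = 1.0; y[i] = 0.0; }

    /* Compute loop with the same static i->thread mapping: each
     * thread then reads and writes mostly node-local memory.         */
    #pragma omp parallel for schedule(static)
    for (size_t i = 0; i < N; i++)
        y[i] += 0.5 * x[i];

    printf("y[0] = %.2f\n", y[0]);
    free(x); free(y);
    return 0;
}

(MPI solvers get the same placement for free when one rank is pinned per NUMA node via the MPI launcher's process binding options.)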
|
October 23, 2013, 07:42 |
|
#4 |
New Member
John McEntee
Join Date: Jun 2013
Posts: 8
Rep Power: 13 |
I have been told that F1 teams will buy a Xeon cluster but run CFD jobs on only half the cores. They are limited in the computing power and wind-tunnel time they are allowed to use (probably measured against some benchmark they have to adhere to), and this way they get about 20% better performance per core-hour. The increased memory bandwidth per core is said to be the reason for the 20% gain over a fully loaded cluster of half the size. |
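That matches what a simple saturation test shows: triad-style bandwidth usually tops out well before the core count does, so leaving cores idle raises the bandwidth available per active core. A generic sweep sketch in the same spirit as the one in post #1 (the 20% figure itself is specific to whatever the F1 teams benchmark):

Code:
/* Bandwidth vs. active cores: time a triad-style kernel at 1, 2, 4,...
 * threads to see where memory bandwidth saturates.
 * Build: gcc -O2 -fopenmp sweep.c -o sweep                           */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N (40UL * 1000 * 1000)

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    if (!a || !b) return 1;
    for (size_t i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; }

    for (int t = 1; t <= omp_get_max_threads(); t *= 2) {
        double best = 1e30;
        for (int k = 0; k < 5; k++) {               /* best of 5 runs */
            double dt = omp_get_wtime();
            #pragma omp parallel for num_threads(t)
            for (size_t i = 0; i < N; i++)
                a[i] = b[i] + 3.0 * a[i];           /* 2 loads, 1 store */
            dt = omp_get_wtime() - dt;
            if (dt < best) best = dt;
        }
        double gbs = 3.0 * N * sizeof(double) / best / 1e9;
        printf("%2d threads: %6.1f GB/s total, %6.1f GB/s per thread\n",
               t, gbs, gbs / t);
    }
    free(a); free(b);
    return 0;
}

If total GB/s flattens out between half and all of the cores, the per-thread column shows exactly the effect described above: the half-loaded configuration gets nearly the same aggregate bandwidth shared among fewer workers.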