|
March 14, 2019, 08:07 |
|
#181 |
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8 |
Hi guys, could you please tell me how to edit the "run" script in the motorbike case in order to run the simulation on a 2-node cluster in an automatic way? I don't know how to "tell" the script to read the "--hostfile machinefile", and the machinefile should change in the following way:

A) master cpu = 2, node cpu = 2
B) master cpu = 3, node cpu = 3
C) master cpu = 4, node cpu = 4
D) master cpu = 6, node cpu = 4

Could anyone kindly give me any suggestion, please?
Astan
|
March 14, 2019, 08:14 |
|
#182 |
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
This is one way (how I do it):
(It also repeats the calculation 3 times.) I have an NFS-mounted folder called "cloud", as you can see in the example. Note that you might want to edit the Allmesh file as well in order to mesh using the cluster. I would probably just update the "machines" file manually. Code:
#!/bin/bash

# Repeat the whole benchmark 3 times
for (( t=0; t<3; t++ )); do

    # Clear old data
    for i in 14; do
        cd run_${i}
        # Control will enter here if the directory exists, and delete it if so
        if [ -d ./100 ]; then
            rm -r 100
        fi
        x=${i}
        for (( c=0; c<x; c++ )); do
            if [ -d ./processor${c} ]; then
                cd processor${c}
                if [ -d ./100 ]; then
                    rm -r 100
                fi
                cd ..
            fi
        done
        # Control will enter here if the log file exists, and delete it if so
        if [ -f ./log.simpleFoam ]; then
            rm log.simpleFoam
        fi
        cd ..
    done

    # Run cases ("for i in 14" runs only the 14-core case; list more core counts to test more)
    for i in 14; do
        echo "Run for ${i}..."
        cd run_$i
        if [ $i -eq 1 ]; then
            simpleFoam > log.simpleFoam 2>&1
        else
            #mpirun --hostfile ~/dev/cloud/bench_template/machines -np ${i} simpleFoam -parallel
            mpirun --hostfile ~/dev/cloud/bench_template/machines -np ${i} --bind-to core simpleFoam -parallel > log.simpleFoam 2>&1
        fi
        cd ..
    done

    # Extract times
    echo "# cores Wall time (s):"
    echo "------------------------"
    for i in 14; do
        echo $i `grep Execution run_${i}/log.simpleFoam | tail -n 1 | cut -d " " -f 3`
    done
done
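For reference, the "machines" file that --hostfile points to is just a plain-text list of hosts. A minimal sketch of one for Open MPI, assuming the two nodes are reachable as "master" and "node" (both hostnames and the slot counts are placeholders for your own setup): Code:
# One line per host; slots = how many MPI ranks may be placed there
master slots=4
node slots=4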
|
March 14, 2019, 08:27 |
|
#183 |
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8 |
Hi Simbelmynė, thank you very much for the script. I'll try to run the simulation and post the results.
I've noticed the option "--bind-to core". It is not the first time I have read about it; from what I've read on the internet it is used when dealing with clusters, but I don't really understand what it is for. Could you please explain what it does? Does the running time decrease compared to the usual "mpirun --hostfile etc." without the "--bind-to core" option? Astan
|
March 14, 2019, 08:48 |
|
#184 |
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
--bind-to core was something I tested when performance was poor. In some cases it can dramatically improve performance, but in my case it did not. Test with and without it.
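If you want to see where the ranks actually land, Open MPI can print the binding of each rank. A quick sketch for comparing the two variants (the hostfile name and core count are placeholders): Code:
# With core binding; --report-bindings prints each rank's core mask to stderr
mpirun --hostfile machines -np 8 --bind-to core --report-bindings simpleFoam -parallel > log.bind 2>&1

# Without binding, for comparison
mpirun --hostfile machines -np 8 --bind-to none simpleFoam -parallel > log.nobind 2>&1

# Compare the last reported execution time of each run
grep Execution log.bind | tail -n 1
grep Execution log.nobind | tail -n 1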
Also, here is some code that you can add if you wish to have it automated: Code:
# Writes the four machinefile configurations from post #181:
# A) master 2 / node 2, B) master 3 / node 3, C) master 4 / node 4, D) master 6 / node 4
for (( t=0; t<4; t++ )); do
    echo "Master cpu = $((t+2))" > machinesFile
    echo "node cpu = $((t+2))" >> machinesFile
    # The last configuration (D) is asymmetric, so overwrite it
    if [ ${t} -eq 3 ]; then
        echo "Master cpu = $((t+3))" > machinesFile
        echo "node cpu = $((t+1))" >> machinesFile
    fi
done
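Note that Open MPI itself expects hostfile lines of the form "hostname slots=N" rather than the "Master cpu = ..." text above, so to fully automate the four configurations you would write real hostnames into the file and launch the solver inside the loop. A sketch under that assumption (the hostnames "master" and "node" are placeholders, and each run expects the case to be decomposed for the matching rank count): Code:
#!/bin/bash
# Core counts per configuration: A) 2+2, B) 3+3, C) 4+4, D) 6+4
master_cpus=(2 3 4 6)
node_cpus=(2 3 4 4)

for (( t=0; t<4; t++ )); do
    # Rewrite the hostfile for this configuration
    echo "master slots=${master_cpus[t]}" >  machinesFile
    echo "node slots=${node_cpus[t]}"     >> machinesFile

    np=$(( master_cpus[t] + node_cpus[t] ))
    # Requires the case to have been decomposed into ${np} subdomains (decomposePar)
    mpirun --hostfile machinesFile -np ${np} simpleFoam -parallel > log.simpleFoam.${np} 2>&1
done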
|
March 15, 2019, 19:38 |
Also trying Cluster
|
#185 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
Hi Simbelmynė,
I am trying to build a cluster of Dell R810 quad-processor machines. I just got the second one (and the two-node cluster) working. However, it turns out this second one is much slower than the first: 50% slower single-core and 200% slower at 32 cores.

The important difference is that the first (good) one has all DIMM slots filled with 8 GB DIMMs, while the second one has half of the slots filled with 16 GB RDIMMs (R4x4). The behaviour indicated a memory bottleneck, so I ran memory diagnostics. There were no issues. I also went ahead and did an overall system check; everything shows a Pass. BIOS settings are the same between machines, except that I also disabled prefetch this morning, with no effect.

It looks like I am just getting half the bandwidth with half the slots filled. How can I get around that?

Will

P.S. The new machine has 4x E7-8870 2.4 GHz and the good machine has 4x E7-4870 2.4 GHz.
|
March 16, 2019, 04:01 |
|
#186 |
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
Perhaps the memory is incorrectly populated?
Thermal throttling on one or more CPUs? Did you buy from a reputable source? If not, perhaps you have engineering samples in the new setup? What operating system do you run? Btw, this question is probably better off in a new thread. In any case, check the hardware info first.
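A couple of standard Linux commands for that (a sketch; dmidecode needs root): Code:
# Which DIMM slots are populated, with size and configured speed
sudo dmidecode --type memory | grep -E "Locator|Size|Speed"

# CPU model, socket count and NUMA layout
lscpu

# Current per-core clocks; values stuck well below base clock under load hint at throttling
grep "cpu MHz" /proc/cpuinfo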
|
March 16, 2019, 09:46 |
|
#187 |
Member
Andrew
Join Date: Mar 2018
Posts: 82
Rep Power: 8 |
Thank you for the information, Simbelmynė. I'll try it and post the results!
Astan |
|
March 16, 2019, 14:35 |
|
#188 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
A single Dell PowerEdge R810, 4x E7-4870 (2.4 GHz, 10 cores/20 threads each):

Code:
# cores   Wall time (s)
1         2138.42
2         1213.28
4         454.14
6         329.9
8         243.29
10        166.17
12        182.94
16        160.47
20        149.23
24        144.08
30        148.82
32        139.42
36        138.79
40        151.01

Two Dell PowerEdge R810, 4x E7-4870 (2.4 GHz, 10 cores/20 threads each), with the first half of the processes on one node and the rest on the other:

Code:
# cores   Wall time (s)
8         263.8
16        155.51
24        122.77
32        105.56
40        102.47
48        86.83
64        78.25

The speed of the network is just 1 Gb Ethernet right now (switch limited), so I might improve on this a bit. The difference at 8 cores is 20 seconds.

Last edited by wkernkamp; March 17, 2019 at 20:53.
|
March 28, 2019, 21:28 |
Dell R710 benchmark
|
#189 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
Dell PowerEdge R710, 2x Xeon E5649 2.53 GHz (12 cores total), 48 GB.
The memory runs at 1066 MHz (max memory speed is 1333 MHz; faster processors, such as the Xeon X5690, are available).

Performance with 2x E5649:

Code:
# cores   Wall time (s)
1         1486.54
2         880.04
4         422.03
6         342.61
8         317.83
12        307.18
|
March 30, 2019, 11:57 |
|
#190 |
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
i7-940 @ 4.2 GHz. DDR3 1600 MT/s, rank 2. Ubuntu 16.04. OpenFOAM v6
Code:
# cores   Wall time (s)
4         429.09

Interesting side note: I did not install anything on this computer; I just moved the SSD over from a different computer, OS and all (because I am lazy). A correct installation, built from source with a later gcc, might improve the results a bit. Quite impressed with the performance of such an old mainstream chip.
|
March 30, 2019, 13:13 |
|
#191 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Triple-channel memory?
|
|
March 30, 2019, 14:41 |
|
#192 |
Senior Member
Join Date: May 2012
Posts: 551
Rep Power: 16 |
Yes, triple channel.
|
|
March 30, 2019, 15:34 |
|
#193 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
What result does this refer to?
|
|
April 2, 2019, 08:52 |
|
#195 |
New Member
Join Date: Aug 2018
Posts: 4
Rep Power: 8 |
Let me share results of two quad-socket machines.
Machine 1: 4x Xeon E7-8857 v2, 48x 16 GB 2Rx4 DDR3-1866, CentOS 7.2 (on VMware ESXi), OpenFOAM 2.3.x Code:
# cores   Wall time (s)   speedup
------------------------------------------------
1         981.09          1
2         468.17          2.09
4         233.54          4.20
6         161.95          6.05
8         121.9           8.04
12        87.34           11.23
16        67.97           14.43
20        59.46           16.5
24        54              18.16
28        50.64           19.37
32        47.39           20.70

And the second machine: Code:
# cores   Wall time (s)   speedup
------------------------------------------------
1         726.86          1
2         372.47          1.95
4         190.08          3.82
6         133.52          5.44
8         90.02           8.07
12        74.69           9.73
16        52.41           13.86
20        48.08           15.11
24        43.93           16.54
28        35.43           20.51
32        33.17           21.91
40        25.15           28.9
48        23.27           31.23
56        22.18           32.77
64        22.3            32.59
|
April 17, 2019, 00:24 |
|
#196 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
Quote:
Nice results, and interesting. I have bought a few old R810 servers and found that you need to fill all DIMM slots, and that the rank of the DIMMs has to be right so that the total rank per channel is 8; otherwise you lose 60% in speed on this benchmark. It looks like your configuration is not optimal, but I did not dive into the manuals to see how many DIMM slots your machines have. I don't think the virtualization makes a big difference. Let me know if you manage to speed it up even more! Will
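To make the rank arithmetic concrete (an illustrative example, not the exact R810 topology): a 2Rx4 RDIMM carries two ranks, so four such DIMMs on one channel give 4 × 2 = 8 ranks, while populating only two of the slots leaves 4 ranks; with fewer ranks per channel the memory controller has less to interleave across, which costs bandwidth on a memory-bound benchmark like this one.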
|
April 17, 2019, 03:46 |
|
#197 |
New Member
Join Date: Aug 2018
Posts: 4
Rep Power: 8 |
The memory cartridges are at least fully populated according to HP's best practices (HP DL580 Gen8 and HP DL560 Gen10). Still, there may be room for improvement...
|
|
April 27, 2019, 22:11 |
R710 now with faster X5675 processor, 6% faster than E5649
|
#198 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
Dell PowerEdge R710, 12x 4 GB RDIMM at 1067 MHz.

Old result, 2x E5649 2.53 GHz (6 cores per CPU):

Code:
Meshing times:
# cores   Wall time (s)
1         2319.46
2         1526.14
4         840.08
6         653.38
8         547.74
10        540.91
12        533.59

Flow calculation:
# cores   Wall time (s)
1         1486.54
2         880.04
4         422.03
6         342.61
8         317.83
10        333.38
12        307.18

New result, 2x X5675 3.07 GHz (6 cores per CPU):

Code:
Meshing times:
# cores   Wall time (s)
1         1998.08
2         1313.22
4         719.71
6         558.17
8         466.22
12        449.43

Flow calculation:
# cores   Wall time (s)
1         1322.84
2         787.4
4         375.77
6         305.44
8         286.3
12        278.02

Last edited by wkernkamp; May 1, 2019 at 01:49.
|
May 19, 2019, 02:57 |
2x E5-4627 v2, 16x 8 GB RDIMM DDR3-1866
|
#199 |
Senior Member
Will Kernkamp
Join Date: Jun 2014
Posts: 371
Rep Power: 14 |
2x E5-4627 v2, 16x 8 GB RDIMM DDR3-1866

Code:
Meshing times:
# cores   Wall time (s)
1         1480.36
2         992.82
4         542.2
8         329.79
12        294.62
14        245
16        246.43

Flow calculation:
# cores   Wall time (s)
1         938.32
2         506.25
4         236.32
8         131.57
12        108.1
14        102.96
16        101.14
|
July 8, 2019, 22:24 |
|
#200 |
Member
Hector
Join Date: Jul 2010
Location: Barcelona
Posts: 30
Rep Power: 16 |
I am wondering about adding renumberMesh to the process, and how it would change the speed-up results and/or the absolute wall times.
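For anyone who wants to try it: renumberMesh is a stock OpenFOAM utility that renumbers the cell list to reduce matrix bandwidth, and it can simply be run before the solver. A sketch of how it might slot into the benchmark run script (the hostfile name and core count are placeholders): Code:
# Serial case: renumber in place, then solve
renumberMesh -overwrite > log.renumberMesh 2>&1
simpleFoam > log.simpleFoam 2>&1

# Parallel case: renumber the decomposed mesh, then solve
mpirun --hostfile machines -np 8 renumberMesh -overwrite -parallel > log.renumberMesh 2>&1
mpirun --hostfile machines -np 8 simpleFoam -parallel > log.simpleFoam 2>&1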
|