[OpenFOAM.com] problems running in parallel on Mac OS X and Windows: only 1 cpu |
|
April 26, 2016, 16:18 |
problems running in parallel on Mac OS X and Windows: only 1 cpu
|
#1 |
New Member
Thomas Evans
Join Date: Dec 2015
Posts: 21
Rep Power: 11 |
Okay, I have OpenFOAM+ running under Docker on Mac OS X 10.10.5 (Yosemite), and I can generate output. Alas, when I try to run in parallel, e.g.,
% mpirun -np 6 icoFoam -parallel
it only runs on one core. I imagine this is a configuration issue, but I haven't a clue where to start. |
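A quick first check is how many CPUs the container itself can see, since mpirun cannot spread work over cores the virtual machine does not expose. The snippet below is a minimal sketch of that check; the function names `visible_cpus` and `check_ranks` are hypothetical helpers, not part of OpenFOAM or Open MPI.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OpenFOAM or Open MPI).

# visible_cpus: print the number of CPUs this environment can see.
visible_cpus() {
    # nproc comes with GNU coreutils; getconf is a widely available fallback.
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
}

# check_ranks REQUESTED [AVAILABLE]
# Succeeds when REQUESTED MPI ranks fit on AVAILABLE CPUs, fails otherwise.
# AVAILABLE defaults to whatever visible_cpus reports.
check_ranks() {
    requested=$1
    available=${2:-$(visible_cpus)}
    if [ "$requested" -gt "$available" ]; then
        echo "oversubscribed: $requested ranks on $available CPUs"
        return 1
    fi
    echo "ok: $requested ranks on $available CPUs"
}

# Example use inside the container:
#   check_ranks 6 && mpirun -np 6 icoFoam -parallel
```

If `check_ranks 6` reports fewer than 6 CPUs, the limit is probably in the virtual machine settings rather than in OpenFOAM.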
|
April 29, 2016, 18:25 |
|
#2 |
Senior Member
Pawan Ghildiyal
Join Date: Nov 2015
Posts: 135
Rep Power: 11 |
Hi. Docker on Mac uses VirtualBox. Try opening VirtualBox on the Mac and, in its settings, change the number of processors to 6 or 8, whatever is available, and increase the memory as well.
Hope this answers your question. |
|
September 22, 2016, 05:20 |
|
#3 |
Member
Rudolf Hellmuth
Join Date: Sep 2012
Location: Dundee, Scotland
Posts: 40
Rep Power: 14 |
I have 24 cores on a Windows workstation. I set VirtualBox to use 22 processors, but it only runs on up to 8 processors in parallel.
Does anyone know why? Best regards, Rudolf |
|
September 22, 2016, 06:21 |
|
#4 |
Senior Member
Pawan Ghildiyal
Join Date: Nov 2015
Posts: 135
Rep Power: 11 |
How are you checking that it is only using 8 processors in parallel?
|
|
September 22, 2016, 08:42 |
|
#5 | |
Member
Rudolf Hellmuth
Join Date: Sep 2012
Location: Dundee, Scotland
Posts: 40
Rep Power: 14 |
Quote:
Meanwhile, OpenFOAM+ has decomposed my mesh into 20 pieces, and the runParallel script says that it is using 20 processes. (?!!!) I suppose mpirun is sending the 20 mesh pieces to the 8 cores many times during the simulation. I also have a feeling that it is slower to solve by decomposing into 20 pieces than it is if I decompose it into 8 pieces. Best regards, Rudolf |
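When the number of subdomains exceeds the number of visible cores, the most likely explanation is that the operating system time-slices several MPI processes on each core, which would fit the slowdown described above. The sketch below reads `numberOfSubdomains` from a decomposeParDict and works out how many processes end up sharing the busiest core; `subdomains` and `ranks_per_cpu` are hypothetical helper names, and the parsing assumes the entry sits on a single line in the usual `numberOfSubdomains N;` form.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OpenFOAM).

# subdomains FILE: print the numberOfSubdomains value found in FILE.
# Assumes the entry is written on one line as "numberOfSubdomains N;".
subdomains() {
    sed -n 's/^[[:space:]]*numberOfSubdomains[[:space:]]\{1,\}\([0-9]\{1,\}\)[[:space:]]*;.*$/\1/p' "$1" | head -n 1
}

# ranks_per_cpu RANKS CPUS: print ceil(RANKS / CPUS), i.e. the largest
# number of processes that must share a single core.
ranks_per_cpu() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Example use from the case directory:
#   ranks_per_cpu "$(subdomains system/decomposeParDict)" "$(nproc)"
```

For the numbers in this post, `ranks_per_cpu 20 8` gives 3, so some cores would be running three solver processes at once. Matching `numberOfSubdomains` to the core count reported by lscpu avoids that.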
||
September 24, 2016, 16:31 |
|
#6 | |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Quote:
Code:
lscpu
Code:
CPU(s):
On-line CPU(s) list:
Thread(s) per core:
Core(s) per socket:
Socket(s):
NUMA node(s):
|
||
September 26, 2016, 05:10 |
|
#7 |
Member
Rudolf Hellmuth
Join Date: Sep 2012
Location: Dundee, Scotland
Posts: 40
Rep Power: 14 |
Bom dia, Bruno.
Thanks for replying. The result of lscpu is below. This post also has a screenshot showing the VirtualBox settings. I am new to virtual machines, which seem to be the bottleneck here. I don't know what else I can change besides giving it access to 22 processors. I'd appreciate it if you could explain what I could do to run OpenFOAM+ at full power. $ lscpu: Code:
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 1
NUMA node(s): ? (info not displayed. I suppose it's 1.)
Rudolf |
|
September 26, 2016, 18:59 |
|
#8 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Hi Rudolf,
Quick request: The desktop image did help give a bit of perspective... but any chance you can also show the configuration window with the CPU settings for this virtual machine in VirtualBox? I ask because VirtualBox may be showing messages about the configuration. If lscpu is telling us that only 8 cores exist, then something weird is going on. One possibility that comes to mind is that the "of_plus_1606" container was somehow limited to using only 8 cores... @Pawan: Have you had any similar experience with this? Best regards, Bruno |
|
September 27, 2016, 13:03 |
|
#9 |
Member
Rudolf Hellmuth
Join Date: Sep 2012
Location: Dundee, Scotland
Posts: 40
Rep Power: 14 |
Hi Bruno,
I will be away from that machine for a couple of weeks. I will write back here then. Thanks for helping; it is much appreciated. Cheers, Rudolf |
|
October 10, 2016, 06:25 |
|
#10 | |
Member
Rudolf Hellmuth
Join Date: Sep 2012
Location: Dundee, Scotland
Posts: 40
Rep Power: 14 |
Hi Bruno,
I am attaching a screen capture of the VirtualBox configuration window for the CPU settings. Is this helpful at all? Thanks again for the help. Obrigado, Rudolf Quote:
|
||
October 10, 2016, 06:33 |
|
#11 |
Senior Member
Pawan Ghildiyal
Join Date: Nov 2015
Posts: 135
Rep Power: 11 |
Hi..
Thanks for the snapshot. It is strange. Since I do not have a Windows machine with that many cores, I could not check it here. However, I will look into this. Thanks, Pawan |
|
October 10, 2016, 08:51 |
|
#12 |
Member
Rudolf Hellmuth
Join Date: Sep 2012
Location: Dundee, Scotland
Posts: 40
Rep Power: 14 |
Could this 8-core limit be set in Docker instead of VirtualBox?
|
|
October 10, 2016, 20:11 |
|
#13 | |||
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Quick answers:
Quote:
Quote:
Can you please try upgrading your VirtualBox installation to the latest version? It shouldn't affect the Docker VM, although you might need to shut down the Docker VM first. By the way, just to play it safe, what is the error message that the dialogue box is showing in the bottom bar? Namely where it states: Quote:
|
||||
October 11, 2016, 07:55 |
|
#14 | |||||
Member
Rudolf Hellmuth
Join Date: Sep 2012
Location: Dundee, Scotland
Posts: 40
Rep Power: 14 |
I suppose it is working with 22 cores now, but I don't know which step fixed it. I am going to describe everything that I've done.
Quote:
Quote:
It worked, at least. Then Docker had SSH problems (IP something...). I had to delete the default settings and rerun Docker Quickstart Terminal. Afterwards, I got the following error message: Quote:
Quote:
The new lscpu: Code:
CPU(s): 22
On-line CPU(s) list: 0-21
Thread(s) per core: 1
Core(s) per socket: 22
Socket(s): 1
Quote:
The problem was solved for me, but I am not sure which of the above steps made the difference. Thanks a million for your help, Bruno. Best regards, Rudolf |
||||||
October 30, 2016, 10:58 |
|
#15 | ||
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Quick answer @Rudolf: Sorry for the delay in getting back to you on this. Since the problem was solved, I didn't give it priority.
Quote:
Quote:
You're welcome! And once again, many thanks for the detailed steps! This is one of those hard-to-isolate issues, because it's not straightforward to reproduce. Therefore, all of these steps can come in handy! |
|||
July 1, 2019, 19:27 |
Hola Hola problems with parallel solution mac
|
#16 |
New Member
Yovanny Morales Hernández
Join Date: May 2018
Posts: 11
Rep Power: 8 |
I'm a new foamer and I think I've got a problem!
I ran a parallel tutorial without the parallel option, and the log file ended with the following: Code:
Time = 627
smoothSolver: Solving for Ux, Initial residual = 1.22514118e-05, Final residual = 1.15094602e-06, No Iterations 8
smoothSolver: Solving for Uy, Initial residual = 1.22504357e-05, Final residual = 1.15430392e-06, No Iterations 8
smoothSolver: Solving for Uz, Initial residual = 9.99532834e-05, Final residual = 7.21196592e-06, No Iterations 5
GAMG: Solving for p, Initial residual = 2.71922536e-05, Final residual = 1.25330464e-06, No Iterations 2
time step continuity errors : sum local = 9.86043155e-05, global = 6.98390332e-17, cumulative = 1.43705357e-14
ExecutionTime = 698.01 s ClockTime = 707 s
SIMPLE solution converged in 627 iterations
End
Code:
Time = 627
smoothSolver: Solving for Ux, Initial residual = 1.22514118e-05, Final residual = 1.15094602e-06, No Iterations 8
smoothSolver: Solving for Uy, Initial residual = 1.22504357e-05, Final residual = 1.15430392e-06, No Iterations 8
smoothSolver: Solving for Uz, Initial residual = 9.99532834e-05, Final residual = 7.21196592e-06, No Iterations 5
GAMG: Solving for p, Initial residual = 2.71922536e-05, Final residual = 1.25330464e-06, No Iterations 2
time step continuity errors : sum local = 9.86043155e-05, global = 6.98390332e-17, cumulative = 1.43705357e-14
ExecutionTime = 885.54 s ClockTime = 926 s
SIMPLE solution converged in 627 iterations
End
My virtual machine is working with two processors, as can be seen in the Docker configuration in the attached image. Then I checked the CPU with the lscpu command, as recommended by wyldckat, and the terminal gave this information: Code:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 42
Model name: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz
Stepping: 7
CPU MHz: 2691.962
BogoMIPS: 5383.92
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 4096K
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx lm constant_tsc rep_good nopl xtopology nonstop_tsc pni pclmulqdq dtes64 ds_cpl ssse3 cx16 xtpr pcid sse4_1 sse4_2 popcnt aes xsave avx hypervisor lahf_lm kaiser xsaveopt arat
When I run decomposePar with my decomposeParDict, I get the following output: Code:
/*---------------------------------------------------------------------------*\
  =========                 |
  \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
   \\    /   O peration     | Website:  https://openfoam.org
    \\  /    A nd           | Version:  6
     \\/     M anipulation  |
\*---------------------------------------------------------------------------*/
Build : 6-fa1285188035
Exec : decomposePar
Date : Jul 01 2019
Time : 22:21:28
Host : "f5b1076b78d1"
PID : 8530
I/O : uncollated
Case : /home/openfoam/taylor_couetteParallel
nProcs : 1
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time
Decomposing mesh region0
Create mesh
Calculating distribution of cells
Selecting decompositionMethod simple
Finished decomposition in 0.23 s
Calculating original mesh data
Distributing cells to processors
Distributing faces to processors
Distributing points to processors
Constructing processor meshes
Processor 0
    Number of cells = 128932
    Number of faces shared with processor 1 = 952
    Number of processor patches = 1
    Number of processor faces = 952
    Number of boundary faces = 28560
Processor 1
    Number of cells = 128932
    Number of faces shared with processor 0 = 952
    Number of processor patches = 1
    Number of processor faces = 952
    Number of boundary faces = 28560
Number of processor faces = 952
Max number of cells = 128932 (0% above average 128932)
Max number of processor patches = 1 (0% above average 1)
Max number of faces between processors = 952 (0% above average 952)
Time = 0
Processor 0: field transfer
Processor 1: field transfer
End
Last edited by wyldckat; July 9, 2019 at 19:48. Reason: Added [CODE][/CODE] markers |
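The two ExecutionTime values reported in this post (698.01 s and 885.54 s) can be turned into a speedup ratio to make the slowdown concrete. This is only a sketch: `speedup` is a hypothetical helper, and it assumes the first log is the single-process run and the second is the run launched on two processors.

```shell
#!/bin/sh
# Hypothetical helper: speedup T_SERIAL T_PARALLEL
# Prints T_SERIAL / T_PARALLEL to two decimals. Values above 1 mean the
# second run was faster; values below 1 mean it was slower.
speedup() {
    awk -v serial="$1" -v parallel="$2" 'BEGIN { printf "%.2f\n", serial / parallel }'
}

# Times taken from the two logs in this post (assumed order: serial, then parallel).
speedup 698.01 885.54
```

A ratio below 1 on a two-processor run suggests the solver was not actually sharing the work between the two processes.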
|
July 9, 2019, 19:50 |
|
#17 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Quick question/answer: How exactly did you run the solver in parallel? What was the command you used?
Because if you used mpirun manually, then you probably forgot to add the "-parallel" option at the end, which is needed for the solver to truly run in parallel. Without it, you solved the same case twice, taking twice the RAM, which would explain it being slower.
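One way to confirm this from a log is the `nProcs` entry in the OpenFOAM header, which appears as `nProcs : 1` in the decomposePar output posted earlier in this thread. A solver started through mpirun without "-parallel" should also report 1 there. The sketch below extracts that value; `log_nprocs` and `ran_in_parallel` are hypothetical helper names, and the log file name in the usage comment is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OpenFOAM).

# log_nprocs FILE: print the first "nProcs : N" value found in an OpenFOAM log.
log_nprocs() {
    sed -n 's/^nProcs[[:space:]]*:[[:space:]]*\([0-9]\{1,\}\).*$/\1/p' "$1" | head -n 1
}

# ran_in_parallel FILE: succeed only if the log header reports more than one process.
ran_in_parallel() {
    n=$(log_nprocs "$1")
    [ -n "$n" ] && [ "$n" -gt 1 ]
}

# Example use (log.simpleFoam is an assumed file name):
#   ran_in_parallel log.simpleFoam || echo "solver ran serially; was -parallel passed?"
```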
|
|
July 24, 2019, 18:47 |
|
#18 |
New Member
Yovanny Morales Hernández
Join Date: May 2018
Posts: 11
Rep Power: 8 |
Indeed! I was running the solver without the -parallel option! A beginner's mistake! There will be more! Oops!
|
|
June 5, 2020, 13:06 |
|
#19 | |
New Member
Join Date: May 2020
Posts: 11
Rep Power: 6 |
Quote:
Code:
Build : 6-4ed10cc0693c
Exec : decomposePar
Date : Jun 05 2020
Time : 18:02:06
Host : "ubuntu-opsi"
PID : 30334
I/O : uncollated
Case : /home/ubuntu-vm/run/parallel_test
nProcs : 1
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time
Decomposing mesh region0
Create mesh
Calculating distribution of cells
Selecting decompositionMethod scotch
Finished decomposition in 0.01 s
Calculating original mesh data
Distributing cells to processors
Distributing faces to processors
Distributing points to processors
Constructing processor meshes
Processor 0
    Number of cells = 2500
    Number of faces shared with processor 1 = 42
    Number of processor patches = 1
    Number of processor faces = 42
    Number of boundary faces = 5166
Processor 1
    Number of cells = 2500
    Number of faces shared with processor 0 = 42
    Number of processor patches = 1
    Number of processor faces = 42
    Number of boundary faces = 5314
Number of processor faces = 42
Max number of cells = 2500 (0% above average 2500)
Max number of processor patches = 1 (0% above average 1)
Max number of faces between processors = 42 (0% above average 42)
Time = 0
Processor 0: field transfer
Processor 1: field transfer
End
Code:
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 78
Model name: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
Stepping: 3
CPU MHz: 2400.000
BogoMIPS: 4800.00
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0,1
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase avx2 invpcid rdseed clflushopt flush_l1d |
||
June 5, 2020, 15:55 |
|
#20 |
Senior Member
Pawan Ghildiyal
Join Date: Nov 2015
Posts: 135
Rep Power: 11 |
It would be helpful if you could post the logs of both the serial and the parallel run.
Are you running the binaries in VirtualBox? If so, please check how many processors and how much memory you have allocated to it. |
|