April 22, 2014, 12:42 |
Problems running OpenFOAM 2.3 in parallel
|
#1 | |
Senior Member
Vincent RIVOLA
Join Date: Mar 2009
Location: France
Posts: 283
Rep Power: 18 |
Dear all,
Since I installed OpenFOAM 2.3 I've not been able to use it in parallel, and I don't know why. It worked perfectly for years with the previous versions, but this one is giving me a headache on two different machines. I am using Ubuntu 12.04, and I get the following error as soon as I try to run in parallel (this example is with Allrun in the motorBike tutorial, but it's the same for every solver): Quote:
Regarding the setup, I used the source files and compiled everything. After a few attempts I managed to get no compilation errors, but I am still not able to run the cases in parallel.
Thanks for your help,
Vincent |
||
April 22, 2014, 15:28 |
|
#2 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Greetings Vincent,
Which installation instructions did you follow? According to the output you've provided, the problem is that the shell environment is configured to use the custom Open-MPI 1.6.5 that comes with OpenFOAM's ThirdParty package, but it's instead picking up the "libmpi.so" library present in your system, which is not compatible.
Best regards,
Bruno
|
|
April 23, 2014, 02:19 |
|
#3 |
Member
Christian Butcher
Join Date: Jul 2013
Location: Japan
Posts: 85
Rep Power: 13 |
Possibly on the same topic, does OF-2.3.0 have a higher requirement of some kind for the version of OpenMPI?
Currently I have an installation of OF-2.3.0 on the cluster I work with, and for values of $NSLOTS less than or equal to 14, everything works perfectly. When I try to run with more than 14 processors, I get errors like: Code:
qrsh_starter: executing child process (null) failed: No such file or directory
--------------------------------------------------------------------------
A daemon (pid 13339) died unexpectedly with status 1 while attempting to launch so we are aborting.
There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared libraries on the remote node. You may set your LD_LIBRARY_PATH to have the location of the shared libraries on the remote nodes and this will automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process that caused that situation.

FOAM_MPI = openmpi-system and WM_MPLIB = SYSTEMOPENMPI
With both 14 and 80 processors, the mpirun command is used via a qsub'd script (Sun Grid Engine).
I'm further confused about the number 14. The cluster contains a collection of nodes, each with two 8-core processors, i.e. 16 processing cores per node. Consequently, a limit of 16 would make me think I have problems communicating between nodes (although I have password-less ssh connections), but 14 seems a little peculiar.
Edit: Pretty sure this is actually due to memory limits. The amount of memory I requested was slightly higher than the mem/proc available, so only 14 of the 16 cores could be used, since 14 * mem/proc was all of the memory on the node. So I guess this isn't curious at all: when I ask for a 15th processor, it simply requires a second node.
It's been a little while since I tried, but I'm pretty sure under OF-2.2.2 I had 32 cores working without issue.
Best,
Christian
Last edited by chrisb2244; April 23, 2014 at 22:19. Reason: Information about why 14 procs is ok and 15 is not. |
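The memory explanation in the edit above can be checked with a little integer arithmetic. This is only a sketch: the node memory (65536 MB) and the per-slot request (4608 MB) are hypothetical values chosen to reproduce the 14-of-16 pattern, and the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical numbers: a 16-core node with 65536 MB of RAM and a job
# that requests 4608 MB per slot. None of these values come from the thread.

# Usable slots on one node: limited by the core count and by memory per slot.
# $1 = node memory (MB), $2 = memory requested per slot (MB), $3 = cores per node
slots_per_node() {
    by_memory=$(( $1 / $2 ))    # integer division rounds down
    if [ "$by_memory" -lt "$3" ]; then
        echo "$by_memory"
    else
        echo "$3"
    fi
}

# Nodes needed for a given slot count, rounded up.
# $1 = requested slots, $2 = usable slots per node
nodes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

per_node=$(slots_per_node 65536 4608 16)
echo "usable slots per node: $per_node"
echo "nodes for 14 slots: $(nodes_needed 14 "$per_node")"
echo "nodes for 15 slots: $(nodes_needed 15 "$per_node")"
```

With these assumed values the script reports 14 usable slots per node, one node for 14 slots and two nodes for 15, which matches the point at which the job starts needing a remote node.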
|
April 23, 2014, 03:42 |
|
#4 | |
Senior Member
Vincent RIVOLA
Join Date: Mar 2009
Location: France
Posts: 283
Rep Power: 18 |
Quote:
Thanks for your reply. Actually, I would like to run my system mpirun, which is the one I normally used with the previous versions of OpenFOAM. But even when explicitly calling the system mpirun (/usr/bin/mpirun -np 6 snappyHexMesh -parallel) I get a similar error: Code:
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened. This means that this component is either not installed or is unable to be used on your system (e.g., sometimes this means that shared libraries that the component requires are unable to be found/loaded). Note that Open MPI stopped checking at the first component that it did not find.
Host: carbon
Framework: crs
Component: none
--------------------------------------------------------------------------
[carbon:22893] *** Process received signal ***
[carbon:22893] Signal: Segmentation fault (11)
[carbon:22893] Signal code: Address not mapped (1)
[carbon:22893] Failing at address: 0x28
[carbon:22893] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0) [0x2aff99ca4cb0]
[carbon:22893] [ 1] /usr/lib/libopen-pal.so.0(mca_base_select+0x108) [0x2aff99a41518]
[carbon:22893] [ 2] /usr/lib/libopen-pal.so.0(opal_crs_base_select+0x7e) [0x2aff99a5390e]
[carbon:22893] [ 3] /usr/lib/libopen-pal.so.0(opal_cr_init+0x31e) [0x2aff99a320ee]
[carbon:22893] [ 4] /usr/lib/libopen-pal.so.0(opal_init+0x159) [0x2aff99a31a59]
[carbon:22893] [ 5] /usr/lib/libopen-rte.so.0(orte_init+0x4d) [0x2aff997dea0d]
[carbon:22893] [ 6] /usr/bin/mpirun() [0x402fe5]
[carbon:22893] [ 7] /usr/bin/mpirun() [0x402b34]
[carbon:22893] [ 8] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x2aff99ed376d]
[carbon:22893] [ 9] /usr/bin/mpirun() [0x402a59]
[carbon:22893] *** End of error message ***
Segmentation fault (core dumped)

Regarding the instructions, I tried to follow the ones I found on openfoam.org: http://www.openfoam.org/download/source.php
What do you suggest to fix this setup?
Some more information, this is my LD_LIBRARY_PATH: Code:
echo $LD_LIBRARY_PATH
/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64Gcc/CGAL-4.3/lib:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64Gcc/ParaView-4.1.0/lib/paraview-4.1:/home/vincent/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64GccDPOpt/lib/openmpi-1.6.5:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64GccDPOpt/lib/openmpi-1.6.5:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64Gcc/openmpi-1.6.5/lib:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64Gcc/openmpi-1.6.5/lib64:/home/vincent/OpenFOAM/vincent-2.3.0/platforms/linux64GccDPOpt/lib:/home/vincent/OpenFOAM/site/2.3.0/platforms/linux64GccDPOpt/lib:/home/vincent/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64GccDPOpt/lib:/home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64GccDPOpt/lib:/home/vincent/OpenFOAM/OpenFOAM-2.3.0/platforms/linux64GccDPOpt/lib/dummy
(carbon) ~/OpenFOAM/vincent-2.3.0/run/tutorials/incompressible/simpleFoam/motorBike > ls -latr /home/vincent/OpenFOAM/ThirdParty-2.3.0/platforms/linux64GccDPOpt/lib/openmpi-1.6.5

Last edited by wyldckat; April 25, 2014 at 15:56. Reason: merged posts <1h apart and changed QUOTE to CODE |
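A long LD_LIBRARY_PATH like the one above is hard to audit by eye. One thing that may be worth checking is whether every listed directory actually exists, because an entry for a ThirdParty Open-MPI that was never built would simply be skipped by the loader. The helper below is a minimal sketch for that check; the function name is made up, and it only tests for directory existence, not for which libmpi.so ends up being loaded.

```shell
#!/bin/sh
# Print every entry of a colon-separated search path and flag the ones
# that do not exist as directories. The function body runs in a subshell
# (note the parentheses) so the IFS and globbing changes do not leak out.
check_path_entries() (
    IFS=':'
    set -f                      # no filename expansion while splitting
    for entry in $1; do
        [ -n "$entry" ] || continue
        if [ -d "$entry" ]; then
            printf 'ok       %s\n' "$entry"
        else
            printf 'missing  %s\n' "$entry"
        fi
    done
)

# Audit the current environment; prints nothing if the variable is unset.
check_path_entries "${LD_LIBRARY_PATH:-}"
```

Any line marked `missing` for an openmpi-1.6.5 directory would be consistent with the system library being picked up instead, although other causes are possible.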
||
April 25, 2014, 16:04 |
|
#5 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Greetings to all!
@Christian: If I read your post correctly, you figured out that the problem was that more memory was needed than was available on the 1st node. Therefore, mystery solved.
@Vincent: If you followed the instructions from http://www.openfoam.org/download/source.php and did not change the variable "WM_MPLIB" to "SYSTEMOPENMPI" in the file "$HOME/OpenFOAM/OpenFOAM-2.3.0/etc/bashrc", then you have a conflict of settings, because you've built OpenFOAM with the custom Open-MPI and are then trying to use the system's Open-MPI, which is likely incompatible. To find out which mpirun is being used, run: Code:
which mpirun
Code:
source $HOME/OpenFOAM/OpenFOAM-2.3.0/etc/bashrc
Bruno
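The consistency rule described in this post (a build with WM_MPLIB=OPENMPI expects the ThirdParty mpirun, while SYSTEMOPENMPI expects the system one) can be sketched as a small check. The function names and the example ThirdParty path are hypothetical, and the sketch only covers these two WM_MPLIB values; any other value is reported as a mismatch.

```shell
#!/bin/sh
# Report where an mpirun path comes from.
# $1 = path to mpirun (may be empty), $2 = ThirdParty directory
mpirun_origin() {
    prefix="$2/"
    rest=${1#"$prefix"}         # strip the ThirdParty prefix if present
    if [ -z "$1" ]; then
        echo "none"
    elif [ -n "$2" ] && [ "$rest" != "$1" ]; then
        echo "thirdparty"
    else
        echo "system"
    fi
}

# Succeed (exit status 0) when the WM_MPLIB value and the mpirun in use agree.
# $1 = WM_MPLIB value, $2 = path to mpirun, $3 = ThirdParty directory
mpi_setup_consistent() {
    origin=$(mpirun_origin "$2" "$3")
    case "$1:$origin" in
        OPENMPI:thirdparty|SYSTEMOPENMPI:system) return 0 ;;
        *) return 1 ;;
    esac
}

# Demo with a hypothetical ThirdParty location and the system mpirun.
tp=/home/user/OpenFOAM/ThirdParty-2.3.0
if mpi_setup_consistent OPENMPI /usr/bin/mpirun "$tp"; then
    echo "consistent"
else
    echo "mismatch: WM_MPLIB=OPENMPI but mpirun is $(mpirun_origin /usr/bin/mpirun "$tp")"
fi
```

In a shell where the OpenFOAM environment has been sourced, the same check could be called as `mpi_setup_consistent "$WM_MPLIB" "$(command -v mpirun)" "$WM_THIRD_PARTY_DIR"`, assuming both variables are set by etc/bashrc on that installation.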
|
|
June 16, 2019, 11:32 |
|
#6 |
New Member
Elias Trautner
Join Date: Jun 2019
Posts: 4
Rep Power: 7 |
Hello wyldckat,
I have a similar issue, described in this thread:
https://www.cfd-online.com/Forums/op...imulation.html
It would be very nice if you could check it out and see whether you can help me get rid of the bug. Thanks in advance! |
|
December 2, 2019, 17:49 |
Problems running OpenFOAM 2.3 in parallel
|
#7 |
Member
Join Date: Mar 2019
Posts: 81
Rep Power: 7 |
I am trying to run OpenFOAM while sharing the resources between two computers. I included the hostfile but am getting the following error:
Code:
[vm2:26669] *** Process received signal ***
[vm2:26669] Signal: Segmentation fault (11)
[vm2:26669] Signal code: Address not mapped (1)
[vm2:26669] Failing at address: 0x5634a8006d6e
[vm2:26669] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f51f2147890]
[vm2:26669] [ 1] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x3d)[0x7f51f1ddb98d]
[vm2:26669] [ 2] /usr/lib/x86_64-linux-gnu/libopen-pal.so.20(opal_argv_free+0x29)[0x7f51f23a2519]
[vm2:26669] [ 3] /usr/lib/x86_64-linux-gnu/libopen-rte.so.20(+0x283cb)[0x7f51f262e3cb]
[vm2:26669] [ 4] /usr/lib/x86_64-linux-gnu/libopen-rte.so.20(orte_util_add_hostfile_nodes+0xc1)[0x7f51f262f3f1]
[vm2:26669] [ 5] /usr/lib/x86_64-linux-gnu/libopen-rte.so.20(orte_ras_base_allocate+0xd3d)[0x7f51f26607fd]
[vm2:26669] [ 6] /usr/lib/x86_64-linux-gnu/libopen-pal.so.20(opal_libevent2022_event_base_loop+0xdc9)[0x7f51f23ba209]
[vm2:26669] [ 7] mpirun(+0x74a3)[0x5634a6d7e4a3]
[vm2:26669] [ 8] mpirun(+0x5aea)[0x5634a6d7caea]
[vm2:26669] [ 9] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f51f1d65b97]
[vm2:26669] [10] mpirun(+0x59ea)[0x5634a6d7c9ea]
[vm2:26669] *** End of error message ***

PS: I am using OpenFOAM v1812: Code:
$ echo $WM_MPLIB
SYSTEMOPENMPI
$ echo $FOAM_MPI
openmpi-system

Last edited by mm66; December 3, 2019 at 17:03. |
|
December 3, 2019, 12:03 |
|
#8 |
Member
Join Date: Mar 2019
Posts: 81
Rep Power: 7 |
I figured out what was wrong. In the host file I was using this format:
Code:
user@ip cpu=N
instead of this one:
Code:
ip cpu=N

Last edited by mm66; December 3, 2019 at 17:03. |
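For an existing host file written with user names, the prefix can be stripped mechanically. The filter below is a sketch only: the function name, the user name and the addresses are hypothetical, and it assumes the `user@` part is the first token on each line.

```shell
#!/bin/sh
# Strip a leading "user@" from each hostfile line, leaving "host cpu=N".
# Lines whose first token contains no "@" pass through unchanged.
strip_hostfile_users() {
    sed 's/^[[:space:]]*[^@[:space:]]*@//'
}

# Hypothetical hostfile content; the addresses are from a documentation range.
printf '%s\n' 'alice@192.0.2.10 cpu=4' '192.0.2.11 cpu=4' | strip_hostfile_users
```

Both demo lines come out in the `ip cpu=N` form. Redirecting the output to a new file and passing that file to mpirun keeps the original host file untouched.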
|