[OpenFOAM.org] MPI compiling and version mismatch
June 6, 2015, 08:59
MPI compiling and version mismatch
#1
New Member
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
Hi.
I managed to run OpenFOAM 2.4.0 (deb package on Ubuntu) on 2 workstations. I am updating a third workstation to the same versions, so I expect no problems there; it has OpenMPI 1.6.5. I also have two servers running Debian Squeeze, where I compiled OpenFOAM 2.4.0 following this manual: https://openfoamwiki.net/index.php/I...u#Ubuntu_10.04 Those servers ship with OpenMPI 1.4.x, and OpenFOAM runs fine there in serial. To get matching MPI versions, I later downloaded the source of OpenMPI 1.6.5 and compiled it with this configuration:
Code:
./configure --prefix="/usr/local" --disable-orterun-prefix-by-default --enable-shared --disable-static --disable-mpi-f77 --disable-mpi-f90 --disable-mpi-profile --with-sge --libdir="/usr/local/lib64"
I also tried this variant:
Code:
./configure --prefix="/usr" --disable-orterun-prefix-by-default --enable-shared --disable-static --disable-mpi-f77 --disable-mpi-f90 --disable-mpi-profile --with-sge --libdir="/usr/local/lib64"
All nodes now report the same mpirun version:
Code:
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$ mpirun -hostfile machines -np 4 mpirun --version
mpirun (Open MPI) 1.6.5
Report bugs to http://www.open-mpi.org/community/help/
mpirun (Open MPI) 1.6.5
Report bugs to http://www.open-mpi.org/community/help/
mpirun (Open MPI) 1.6.5
Report bugs to http://www.open-mpi.org/community/help/
mpirun (Open MPI) 1.6.5
Report bugs to http://www.open-mpi.org/community/help/
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$
But running the tutorial in parallel fails:
Code:
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$ mpirun -hostfile machines -np 4 compressibleInterFoam -parallel
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):

  orte_ess_base_build_nidmap failed
  --> Returned value Data unpack would read past end of buffer (-26) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[srv2:16051] [[47027,1],2] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ../../../orte/util/nidmap.c at line 371
[srv2:16051] [[47027,1],2] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ../../../../../orte/mca/ess/base/ess_base_nidmap.c at line 62
[srv2:16051] [[47027,1],2] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ../../../../../../orte/mca/ess/env/ess_env_module.c at line 173
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Data unpack would read past end of buffer (-26) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[srv2:16051] [[47027,1],2] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ../../../orte/runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during MPI_INIT; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Data unpack would read past end of buffer" (-26) instead of "Success" (0)
--------------------------------------------------------------------------
*** The MPI_Init() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[srv2:16051] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 16051 on node ks-mars exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$
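From what I have read so far, the "Data unpack would read past end of buffer" errors are the typical symptom of mismatched Open MPI versions between the machine that launches the job and the machines that run it. Besides checking mpirun itself (as above), it is probably worth checking which MPI library the solver binary is actually linked against on each machine. A sketch of both checks, assuming passwordless SSH, the same "machines" hostfile, and compressibleInterFoam on every node's PATH:
Code:
# launcher version on every host in the hostfile
for host in $(awk '{print $1}' machines); do
    echo "== $host =="
    ssh "$host" 'which mpirun && mpirun --version | head -n 1'
done

# the MPI library the solver really links against (repeat on each node)
ldd "$(which compressibleInterFoam)" | grep -i mpi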
June 6, 2015, 09:35
#2
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128
Quick answer:
June 6, 2015, 09:51
#3
New Member
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
OK, many thanks for your quick answer.
Quote:
Since I uninstalled the system's OpenMPI and compiled OpenMPI 1.6.5 in a separate directory and installed it, is that now the SYSTEMOPENMPI? The OpenMPI delivered with OpenFOAM 2.4.0 is 1.8.x; will it work if the main workstation I start from has OpenMPI 1.6.5? Or do I need to change the version in settings.sh and swap the openmpi directory in the ThirdParty dir to openmpi 1.6.5?
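My current understanding of the two options, as a sketch (please correct me if this is wrong): WM_MPLIB=SYSTEMOPENMPI makes OpenFOAM use whatever mpirun/mpicc are found on the PATH, while WM_MPLIB=OPENMPI builds and uses the copy under ThirdParty. So the choice would be made like this:
Code:
# Option A: use the self-installed system-wide Open MPI (here 1.6.5 in /usr/local)
source $HOME/OpenFOAM/OpenFOAM-2.4.0/etc/bashrc WM_MPLIB=SYSTEMOPENMPI

# Option B: build and use the Open MPI that lives under ThirdParty
source $HOME/OpenFOAM/OpenFOAM-2.4.0/etc/bashrc WM_MPLIB=OPENMPI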
June 6, 2015, 11:27
#4
New Member
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
OK, I tried the second way.
I copied the source of openmpi-1.6.5 into the ThirdParty directory, changed the version in settings.sh, and set WM_MPLIB=OPENMPI. After compiling, it uses the newly built OpenMPI:
Code:
$ which mpirun
/home/piotr/OpenFOAM/ThirdParty-2.4.0/platforms/linuxGcc48/openmpi-1.6.5/bin/mpirun
At the top of my .bashrc I set:
Code:
source $HOME/OpenFOAM/OpenFOAM-2.4.0/etc/bashrc WM_NCOMPPROCS=4 WM_MPLIB=OPENMPI foamCompiler=ThirdParty WM_COMPILER=Gcc48 WM_ARCH_OPTION=32
But the parallel run now fails with a different error:
Code:
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$ mpirun -hostfile machines -np 4 compressibleInterFoam -parallel
[3]
[3]
[3] --> FOAM FATAL IO ERROR:
[3] error in IOstream "IOstream" for operation operator>>(Istream&, List<T>&) : reading first token
[3]
[3] file: /*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.4.0                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.4.0-dcea1e13ff76
Exec   : compressibleInterFoam -parallel
Date   : Jun 06 2015
Time   : 16:24:11
Host   : "srv2"
PID    : 16612
IOstream at line 0.
[3]
[3]     From function IOstream::fatalCheck(const char*) const
[3]     in file db/IOstreams/IOstreams/IOstream.C at line 114.
[3]
FOAM parallel run exiting
[3]
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 3 with PID 15886 on node biuro-4 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
Do you have an idea what the problem is? Or do I need to compile OpenFOAM on every workstation now to get exactly the same versions? biuro-4 is Ubuntu 14.04 with the package install; srv2 is a Debian server with the compiled OpenFOAM + OpenMPI.
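One check that might narrow this down, as a sketch (it assumes the solver and the OpenFOAM environment are supposed to be available in the shells that mpirun opens on each node): ask every launched rank what it actually resolves:
Code:
# each rank reports its host, the solver binary it would run,
# and whether the OpenFOAM environment made it across
mpirun -hostfile machines -np 4 bash -c 'hostname; which compressibleInterFoam; echo "WM_PROJECT_DIR=$WM_PROJECT_DIR"'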
June 9, 2015, 11:56
#5
New Member
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
Now I am going with openmpi 1.8.5, as delivered with the OpenFOAM 2.4.0 ThirdParty package.
Two directories were missing there; the first one was needed to compile:
Code:
$ diff -r ThirdParty-2.4.0/openmpi-1.8.5/ ../openmpi-1.8.5
Only in ../openmpi-1.8.5/contrib/dist/mofed: debian
Only in ../openmpi-1.8.5/ompi/contrib/vt/vt/tools/vtsetup: src
After I copied the mofed/debian directory from the original openmpi source, I could compile OpenFOAM. Maybe it is only needed for Debian/Ubuntu?
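In case anyone else hits the same missing directories, this is roughly the workaround, as a sketch (the download URL follows Open MPI's usual layout, and the paths assume the pristine tarball is unpacked next to the ThirdParty tree, as in the diff above):
Code:
# fetch the pristine upstream sources
wget http://www.open-mpi.org/software/ompi/v1.8/downloads/openmpi-1.8.5.tar.gz
tar xzf openmpi-1.8.5.tar.gz

# copy back the directories missing from the ThirdParty tarball
cp -r openmpi-1.8.5/contrib/dist/mofed/debian ThirdParty-2.4.0/openmpi-1.8.5/contrib/dist/mofed/
cp -r openmpi-1.8.5/ompi/contrib/vt/vt/tools/vtsetup/src ThirdParty-2.4.0/openmpi-1.8.5/ompi/contrib/vt/vt/tools/vtsetup/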
June 15, 2015, 16:56
#6
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128
Hi pki,
OK, I've finally managed to come back to your questions, and I think I've figured out why the problem is occurring this time. Unfortunately, the installation instructions I've been writing on the wiki are mostly meant for use on a single machine. For example, this line of code:
Code:
source $HOME/OpenFOAM/OpenFOAM-2.4.0/etc/bashrc WM_NCOMPPROCS=4 WM_MPLIB=OPENMPI foamCompiler=ThirdParty WM_COMPILER=Gcc48 WM_ARCH_OPTION=32
The solution is to:
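(A sketch of one way this is commonly handled, not necessarily the exact steps intended here: the extra arguments on that source line only exist in shells that source the file with them, so the shells that mpirun opens on the other machines fall back to default settings. Putting the settings into OpenFOAM's optional etc/prefs.sh, which etc/bashrc picks up on every node, and launching through the bin/foamExec wrapper, which sources the environment before executing the solver, keeps the environments consistent.)
Code:
# $HOME/OpenFOAM/OpenFOAM-2.4.0/etc/prefs.sh (sourced automatically by etc/bashrc)
WM_NCOMPPROCS=4
WM_MPLIB=OPENMPI
foamCompiler=ThirdParty
WM_COMPILER=Gcc48
WM_ARCH_OPTION=32

# launch so that every rank sources the OpenFOAM environment before the solver starts
mpirun -hostfile machines -np 4 $HOME/OpenFOAM/OpenFOAM-2.4.0/bin/foamExec compressibleInterFoam -parallel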
Best regards, Bruno
June 15, 2015, 17:01
#7
New Member
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
Many thanks, you are great. A question: do all machines have to be either 32-bit OS or 64-bit OS, with mixing not allowed?
Or should I set WM_ARCH_OPTION=32 on all machines, even those with an amd64 OS?
June 15, 2015, 17:21
#8
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128
If at least one machine is 32-bit, then you should use the 32-bit build on all machines.
Mixed platforms (32- and 64-bit) are very complicated to achieve, and I'm not aware of anyone reporting success with such a set-up here on the forum. In addition, I'm not sure that Open MPI supports this... though I do vaguely remember that some MPI toolboxes support it. Either way, this should tell you which architecture you have on a machine:
Code:
uname -m
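To check all nodes in one go, a small sketch (it assumes passwordless SSH and the "machines" hostfile used earlier in this thread):
Code:
# print the architecture reported by every host in the hostfile
for host in $(awk '{print $1}' machines); do
    printf '%s: ' "$host"
    ssh "$host" uname -m
done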