[OpenFOAM.org] MPI compiling and version mismatch

Old   June 6, 2015, 08:59
Default MPI compiling and version mismatch
  #1
pki
New Member
 
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
pki is on a distinguished road
Hi.

I managed to run OpenFOAM 2.4.0 (deb package on Ubuntu) on 2 workstations. I am updating a third workstation to install the same versions, so I expect no problems. I have Open MPI 1.6.5 there.

Then I have two servers with Debian Squeeze, which ship Open MPI 1.4.x. I managed to compile OpenFOAM 2.4.0 on them, and OpenFOAM runs fine in serial. I compiled it following this manual: https://openfoamwiki.net/index.php/I...u#Ubuntu_10.04

Later I downloaded the source of Open MPI 1.6.5 and compiled it with the following configuration:
Code:
./configure --prefix="/usr/local" --disable-orterun-prefix-by-default --enable-shared --disable-static --disable-mpi-f77 --disable-mpi-f90 --disable-mpi-profile --with-sge --libdir="/usr/local/lib64"
I also tried:
Code:
./configure --prefix="/usr" --disable-orterun-prefix-by-default --enable-shared --disable-static --disable-mpi-f77 --disable-mpi-f90 --disable-mpi-profile --with-sge --libdir="/usr/local/lib64"
I can launch simple commands such as "ls" or "mpirun --version" through mpirun and get the correct output:
Code:
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$ mpirun -hostfile machines -np 4 mpirun --version
mpirun (Open MPI) 1.6.5

Report bugs to http://www.open-mpi.org/community/help/
mpirun (Open MPI) 1.6.5

Report bugs to http://www.open-mpi.org/community/help/
mpirun (Open MPI) 1.6.5

Report bugs to http://www.open-mpi.org/community/help/
mpirun (Open MPI) 1.6.5

Report bugs to http://www.open-mpi.org/community/help/
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$
But when I try to run a solver in parallel, I get this:
Code:
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$ mpirun -hostfile machines -np 4 compressibleInterFoam -parallel
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_base_build_nidmap failed
  --> Returned value Data unpack would read past end of buffer (-26) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[srv2:16051] [[47027,1],2] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ../../../orte/util/nidmap.c at line 371
[srv2:16051] [[47027,1],2] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ../../../../../orte/mca/ess/base/ess_base_nidmap.c at line 62
[srv2:16051] [[47027,1],2] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ../../../../../../orte/mca/ess/env/ess_env_module.c at line 173
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_set_name failed
  --> Returned value Data unpack would read past end of buffer (-26) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[srv2:16051] [[47027,1],2] ORTE_ERROR_LOG: Data unpack would read past end of buffer in file ../../../orte/runtime/orte_init.c at line 132
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Data unpack would read past end of buffer" (-26) instead of "Success" (0)
--------------------------------------------------------------------------
*** The MPI_Init() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[srv2:16051] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 16051 on
node ks-mars exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$
Can you help?

Old   June 6, 2015, 09:35
Default
  #2
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128
wyldckat is a name known to all
Quick answer:
  1. If you're using a custom Open-MPI, then you need to compile OpenFOAM with that version as well.
  2. The link you provided is for Ubuntu 10.04.
    Either way, all instructions on that page say to build with the system's Open-MPI, not with a custom MPI.
  3. In other words, you need to change the option "WM_MPLIB=SYSTEMOPENMPI" to "WM_MPLIB=OPENMPI".
  4. More details on how the "WM_MPLIB=OPENMPI" is configured for using a custom Open-MPI are in the file:
    Code:
    $WM_PROJECT_DIR/etc/config/settings.sh
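
One way to spot the mismatch described in point 1 is to compare the Open MPI release that mpirun reports with the MPI library that OpenFOAM itself is linked against. The following is only a minimal sketch: the helper names ompi_version and linked_libmpi are made up for illustration, and the library path in the usage comment assumes a sourced OpenFOAM 2.x environment in which $FOAM_LIBBIN/$FOAM_MPI/libPstream.so exists.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of OpenFOAM or Open MPI.

# Read `mpirun --version` output on stdin and print the version, e.g. 1.6.5.
ompi_version() {
    sed -n 's/^mpirun (Open MPI) \([0-9][0-9.]*\).*$/\1/p' | head -n 1
}

# Read `ldd` output on stdin and print the path that libmpi resolves to,
# or "not found" if the dynamic linker could not locate it.
linked_libmpi() {
    awk '$1 ~ /^libmpi\.so/ && $2 == "=>" {print ($3 == "not" ? "not found" : $3)}'
}

# Usage on a machine with the OpenFOAM environment sourced (assumed paths):
#   mpirun --version 2>&1 | ompi_version
#   ldd "$FOAM_LIBBIN/$FOAM_MPI/libPstream.so" | linked_libmpi
```

If the version printed by the first command and the library path printed by the second belong to different Open MPI installations, OpenFOAM was most likely built against a different Open MPI than the one launching the job.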

Old   June 6, 2015, 09:51
Default
  #3
pki
New Member
 
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
pki is on a distinguished road
OK, many thanks for your quick answer.

Quote:
The link you provided is for Ubuntu 10.04.
It is the closest to Debian Squeeze that I found.


Since I uninstalled the system's Open MPI, compiled Open MPI 1.6.5 in a separate directory and installed it, is that now the SYSTEMOPENMPI?

The Open MPI delivered with OpenFOAM 2.4.0 is 1.8.x. Will it work if the main workstation from which I start the run has Open MPI 1.6.5? Or do I need to change the version in settings.sh and swap the openmpi directory in the ThirdParty directory for openmpi 1.6.5?

Old   June 6, 2015, 11:27
Default
  #4
pki
New Member
 
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
pki is on a distinguished road
OK, I tried the second way.

I copied the source of openmpi-1.6.5 into the ThirdParty directory, changed the version in settings.sh, set WM_MPLIB=OPENMPI and compiled. It is using the newly compiled mpirun now.
Code:
$ which mpirun
/home/piotr/OpenFOAM/ThirdParty-2.4.0/platforms/linuxGcc48/openmpi-1.6.5/bin/mpirun

At the top of my .bashrc I set:
Code:
source $HOME/OpenFOAM/OpenFOAM-2.4.0/etc/bashrc WM_NCOMPPROCS=4 WM_MPLIB=OPENMPI foamCompiler=ThirdParty WM_COMPILER=Gcc48 WM_ARCH_OPTION=32
Running the solver with MPI locally on the server works, but trying to run it from my workstation gives me an error:
Code:
piotr@biuro-4:~/OpenFOAM/piotr-2.4.0/run/tutorials/multiphase/compressibleInterFoam/laminar/depthCharge3D$ mpirun -hostfile machines -np 4 compressibleInterFoam -parallel
[3] 
[3] 
[3] --> FOAM FATAL IO ERROR: 
[3] error in IOstream "IOstream" for operation operator>>(Istream&, List<T>&) : reading first token
[3] 
[3] file: /*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.4.0                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.4.0-dcea1e13ff76
Exec   : compressibleInterFoam -parallel
Date   : Jun 06 2015
Time   : 16:24:11
Host   : "srv2"
PID    : 16612
IOstream at line 0.
[3] 
[3]     From function IOstream::fatalCheck(const char*) const
[3]     in file db/IOstreams/IOstreams/IOstream.C at line 114.
[3] 
FOAM parallel run exiting
[3] 
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD 
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 3 with PID 15886 on
node biuro-4 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
The [3] process is the process on the workstation (biuro-4).

Do you have an idea what the problem is? Or do I now need to compile OpenFOAM on every workstation to get exactly the same versions?
biuro-4 is Ubuntu 14.04 with the package installed; srv2 is a server with Debian and a compiled OpenFOAM + Open MPI.

Old   June 9, 2015, 11:56
Default
  #5
pki
New Member
 
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
pki is on a distinguished road
Now I am going with Open MPI 1.8.5, as delivered in the OpenFOAM 2.4.0 ThirdParty package.

Two directories were missing there. The first one was needed to compile.
Code:
$ diff -r ThirdParty-2.4.0/openmpi-1.8.5/ ../openmpi-1.8.5
Only in ../openmpi-1.8.5/contrib/dist/mofed: debian
Only in ../openmpi-1.8.5/ompi/contrib/vt/vt/tools/vtsetup: src
After I copied the mofed/debian directory from the original Open MPI source, I could compile OpenFOAM. Maybe it is only needed for Debian/Ubuntu?

Old   June 15, 2015, 16:56
Default
  #6
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128
wyldckat is a name known to all
Hi pki,

OK, I've finally managed to come back to your questions, and I think I've figured out why the problem is occurring this time.
Unfortunately, the installation instructions I've been writing on the wiki are mostly meant for use on a single machine. For example, this line of code:
Code:
source $HOME/OpenFOAM/OpenFOAM-2.4.0/etc/bashrc WM_NCOMPPROCS=4 WM_MPLIB=OPENMPI foamCompiler=ThirdParty WM_COMPILER=Gcc48 WM_ARCH_OPTION=32
works fine on a single machine. But when we try to run in parallel, the shell environment isn't loaded in the same way on all machines.

The solution is to:
  1. Edit the file "$HOME/OpenFOAM/OpenFOAM-2.4.0/etc/bashrc" and edit the values for the respective environment variables, namely:
    Code:
    WM_MPLIB=OPENMPI
    foamCompiler=ThirdParty
    WM_COMPILER=Gcc48
    WM_ARCH_OPTION=32
    • Or place the lines above inside the file "$HOME/OpenFOAM/OpenFOAM-2.4.0/etc/prefs.sh", which will be loaded automatically by OpenFOAM's "bashrc".
    • WARNING: Make sure that all machines have the same installation settings, e.g. "WM_ARCH_OPTION=32". Otherwise, it might try to load incompatible binaries for running in parallel.
  2. Then edit your personal "~/.bashrc" file and do not use the alias trick. You will have to source the shell environment directly, for example by having the following command as the last command in your "~/.bashrc" file:
    Code:
    source $HOME/OpenFOAM/OpenFOAM-2.4.0/etc/bashrc
  3. Make sure that either all machines use the same shared "/home" folder, or that OpenFOAM is installed in exactly the same path on every machine and that the "~/.bashrc" file on every machine has the same "source ..." command at the end of the file.
  4. Make sure that you do not have any other sourcing commands for other OpenFOAM versions working in your home environments on each machine.
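
The checks in the list above can be partly automated. The following is a minimal sketch, assuming password-less ssh to every host and a hostfile such as the "machines" file used earlier in this thread; the function names hostfile_hosts, report_env and same_settings are hypothetical. It asks each host, through a non-interactive ssh session, which OpenFOAM settings and which mpirun it ends up with, so differences between machines become visible before starting a parallel run.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of OpenFOAM or Open MPI.

# Print the host names in an Open MPI style hostfile: the first field of
# every line that is neither blank nor a '#' comment.
hostfile_hosts() {
    awk 'NF > 0 && $1 !~ /^#/ {print $1}' "$1"
}

# Print one line per host: the host name followed by the OpenFOAM settings
# and the mpirun path seen by a non-interactive ssh session on that host.
report_env() {
    for host in $(hostfile_hosts "$1"); do
        printf '%s ' "$host"
        # Single quotes: the variables are expanded on the remote machine.
        ssh -n "$host" 'echo "$WM_PROJECT_VERSION $WM_MPLIB $WM_COMPILER $WM_ARCH_OPTION $(command -v mpirun)"' \
            || echo "unreachable"
    done
}

# Read report_env output on stdin; succeed only if every host reports
# identical settings (everything after the host name).
same_settings() {
    awk '{ $1 = ""; seen[$0] = 1 }
         END { n = 0; for (k in seen) n++; exit (n == 1 ? 0 : 1) }'
}
```

For example, "report_env machines" prints the settings for inspection, and "report_env machines | same_settings || echo 'settings differ'" flags a mismatch. Empty fields in the report suggest that the OpenFOAM environment is not being sourced for non-interactive shells on that host.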
Hopefully this will solve the problem you're having.


Best regards,
Bruno

Old   June 15, 2015, 17:01
Default
  #7
pki
New Member
 
Piotr Kuna
Join Date: Jun 2015
Posts: 8
Rep Power: 11
pki is on a distinguished road
Many thanks, you are great. One question: do all machines have to run either a 32-bit or a 64-bit OS, with no mixing allowed?

Or should I set WM_ARCH_OPTION=32 on all machines, even those with an amd64 OS?

Old   June 15, 2015, 17:21
Default
  #8
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,982
Blog Entries: 45
Rep Power: 128
wyldckat is a name known to all
If at least one machine is 32-bit, then you should use the 32-bit build on all machines.

A mixed-platform set-up (32- and 64-bit) is very complicated to achieve, and I'm not aware of anyone reporting success with one here on the forum. In addition, I'm not sure that Open-MPI supports this option... but I do vaguely remember that some MPI toolboxes do support this.

Either way, this should tell you which bit architecture you have on a machine:
Code:
uname -m
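
As a rough aid for reading that output, the sketch below maps common "uname -m" values to a bit width. The function name arch_bits is hypothetical and the list of machine names is not exhaustive. Note that "uname -m" describes the kernel; "getconf LONG_BIT" can be used in addition to see the word size of the installed userland, which may be 32-bit even under a 64-bit kernel.

```shell
#!/bin/sh
# Hypothetical helper for illustration: map a `uname -m` value to 32 or 64.
arch_bits() {
    case "$1" in
        x86_64|amd64|aarch64|ppc64|ppc64le|s390x) echo 64 ;;
        i386|i486|i586|i686|armv7l)               echo 32 ;;
        *)                                        echo unknown ;;
    esac
}

# Report the bit width of the kernel on the current machine.
arch_bits "$(uname -m)"
```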

All times are GMT -4.