what is wrong with the mpirun parameter -mca ?

Old   February 12, 2010, 23:25
  #1
New Member
 
LEE
Join Date: Feb 2010
Posts: 4
Hi all,

I am new to running OpenFOAM 1.6 in parallel. On our cluster (SUSE 10.2 with InfiniBand and PBS installed), mpirun works with a job script like this:

## it is ok for parallel
#PBS -N mycase
## Submit to specified nodes:
##PBS -S /bin/bash
#PBS -l nodes=1:ppn=16
#PBS -j oe
#PBS -l walltime=00:10:00
##PBS -q debug

cat $PBS_NODEFILE
##cat $PBS_O_WORKDIR
##cd $PBS_O_WORKDIR

NP=`cat $PBS_NODEFILE|wc -l`

mpirun -np $NP -machinefile $PBS_NODEFILE interFoam -parallel

It fails, however, when only the mpirun line of the script is changed to:

mpirun -np $NP -machinefile $PBS_NODEFILE --mca btl self,openib interFoam -parallel

The output is as follows:



--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.

Host: node11
Framework: btl
Component: openib
--------------------------------------------------------------------------
[node11:06179] mca: base: components_open: component pml / csum open function failed
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.

Host: node11
Framework: btl
Component: openib
--------------------------------------------------------------------------
[node11:06179] mca: base: components_open: component pml / ob1 open function failed
--------------------------------------------------------------------------
No available pml components were found!

This means that there are no components of this type installed on your
system or all the components reported that they could not be used.

This is a fatal error; your MPI process is likely to abort. Check the
output of the "ompi_info" command and ensure that components of this
type are available on your system. You may also wish to check the
value of the "component_path" MCA parameter and ensure that it has at
least one directory that contains valid MCA components.
--------------------------------------------------------------------------
[node11:06179] PML ob1 cannot be selected
--------------------------------------------------------------------------
mpirun has exited due to process rank 3 with PID 6179 on
node node11 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

What is wrong with the -mca parameter?

Old   February 15, 2010, 04:46
  #2
Senior Member
 
Mark Olesen
Join Date: Mar 2009
Location: https://olesenm.github.io/
Posts: 1,715
Quote:
Originally Posted by donno View Post
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.

Host: node11
Framework: btl
Component: openib
--------------------------------------------------------------------------
[node11:06179] mca: base: components_open: component pml / csum open function failed
--------------------------------------------------------------------------
A requested component was not found, or was unable to be opened. This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded). Note that
Open MPI stopped checking at the first component that it did not find.
Does "ompi_info" report that the components in question are available?
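As a minimal sketch of that check (not part of Open MPI itself): the helper names `list_btl_components` and `has_btl` below are hypothetical, and the sample text is the set of btl lines from the ompi_info output posted in this thread. The helpers read ompi_info-style text on stdin and report whether a given BTL component is listed.

```shell
#!/bin/bash
# List the BTL component names found in ompi_info-style output on stdin.
# (Hypothetical helper; it only parses lines of the form "MCA btl: <name> (...)".)
list_btl_components() {
    sed -n 's/^ *MCA btl: *\([A-Za-z0-9_]*\).*/\1/p'
}

# Succeed if the component named in $1 is listed in the output on stdin.
has_btl() {
    list_btl_components | grep -qx "$1"
}

# Sample lines taken from the ompi_info output posted in this thread.
sample='MCA btl: self (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.3)'

if printf '%s\n' "$sample" | has_btl openib; then
    echo "openib available"
else
    echo "openib missing"
fi
```

On a real system the sample would be replaced by the live output, for example `ompi_info | has_btl openib`, run with the same ompi_info that belongs to the mpirun in use.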

Old   February 15, 2010, 09:37
  #3
New Member
 
LEE
Join Date: Feb 2010
Posts: 4
ompi_info:

Package: Open MPI henry@dm Distribution
Open MPI: 1.3.3
Open MPI SVN revision: r21666
Open MPI release date: Jul 14, 2009
Open RTE: 1.3.3
Open RTE SVN revision: r21666
Open RTE release date: Jul 14, 2009
OPAL: 1.3.3
OPAL SVN revision: r21666
OPAL release date: Jul 14, 2009
Ident string: 1.3.3

some related components:

MCA pml: cm (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: csum (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.3.3)
MCA pml: v (MCA v2.0, API v2.0, Component v1.3.3)
MCA bml: r2 (MCA v2.0, API v2.0, Component v1.3.3)
MCA rcache: vma (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: self (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: sm (MCA v2.0, API v2.0, Component v1.3.3)
MCA btl: tcp (MCA v2.0, API v2.0, Component v1.3.3)

Is the problem with openib?

Old   February 15, 2010, 10:31
  #4
Senior Member
 
BastiL
Join Date: Mar 2009
Posts: 530
I think you need to recompile OpenMPI for openib. Take a look at the Third-Party Allwmake.

Regards BastiL

Old   February 15, 2010, 10:34
  #5
Senior Member
 
Mark Olesen
Join Date: Mar 2009
Location: https://olesenm.github.io/
Posts: 1,715
Quote:
Originally Posted by donno View Post
ompi_info:

Package: Open MPI henry@dm
...
is it wrong with the openib?
It's obviously not configured in the default release. If you examine the Allwmake file in ThirdParty, you'll see something like this:

Code:
        # Infiniband support
        # if [ -d /usr/local/ofed -a -d /usr/local/ofed/lib64 ]
        # then
        #     mpiWith="$mpiWith --with-openib=/usr/local/ofed"
        #     mpiWith="$mpiWith --with-openib-libdir=/usr/local/ofed/lib64"
        # fi
Fix it to suit your configuration and recompile openmpi. Presumably you have the corresponding headers/libraries for infiniband.
While you are at it, you might also consider getting a more recent version (openmpi-1.4.1) - there have been various bugfixes since the 1.3.3 release.
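A rough sketch of how that block might be adapted, assuming the Infiniband headers live under `<prefix>/include/infiniband/verbs.h`: the function name `openib_configure_flags` and the list of candidate prefixes are illustrative assumptions, not part of the ThirdParty Allwmake. Only the two configure options already shown in the snippet above are used.

```shell
#!/bin/bash
# Print Open MPI configure flags for the first prefix that looks like an
# Infiniband (verbs) installation: it must contain include/infiniband/verbs.h
# and a lib64 or lib directory. Returns non-zero if no prefix qualifies.
openib_configure_flags() {
    local prefix libdir
    for prefix in "$@"; do
        [ -f "$prefix/include/infiniband/verbs.h" ] || continue
        for libdir in "$prefix/lib64" "$prefix/lib"; do
            if [ -d "$libdir" ]; then
                echo "--with-openib=$prefix --with-openib-libdir=$libdir"
                return 0
            fi
        done
    done
    return 1
}

# The candidate prefixes are assumptions; adjust them to the cluster at hand.
mpiWith="$(openib_configure_flags /usr/local/ofed /usr/ofed /usr || true)"
echo "mpiWith=$mpiWith"
```

If `mpiWith` comes out empty, no matching headers were found under the listed prefixes, and the development package for the Infiniband stack probably needs to be installed or a different prefix supplied before recompiling.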

Old   March 24, 2010, 11:34
  #6
bjr
Member
 
Ben Racine
Join Date: Mar 2009
Location: Seattle, WA, USA
Posts: 62
I believe I'm having virtually the exact same problem as the OP Donno... OpenSUSE 11.2 cluster, vanilla OF-1.6, trying to use --mca btl openib,self

http://www.cfd-online.com/Forums/ope...not-quite.html

Did you ever find out what your Allwmake file should look like to get this working? Are you using OFED as I am?

Old   March 24, 2010, 17:00
  #7
bjr
Member
 
Ben Racine
Join Date: Mar 2009
Location: Seattle, WA, USA
Posts: 62
Think I just got it figured out...

host2:~ # which ompi_info
/root/OpenFOAM/ThirdParty-1.6/openmpi-1.3.3/platforms/linux64GccDPOpt/bin/ompi_info
host2:~ # ompi_info | grep openib
MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.3)

Turns out it was in the Allwmake file.

The relevant lines being changed to (for my configuration)...

# These lines enable Infiniband support. The original OFED paths are kept
# here for reference; the last configure option is the one used instead.
#   --with-openib=/usr/local/ofed
#   --with-openib-libdir=/usr/local/ofed/lib64
./configure \
    --prefix=$MPI_ARCH_PATH \
    --disable-mpirun-prefix-by-default \
    --disable-orterun-prefix-by-default \
    --enable-shared --disable-static \
    --disable-mpi-f77 --disable-mpi-f90 --disable-mpi-cxx \
    --disable-mpi-profile \
    --with-openib=/usr/include/infiniband

Old   February 2, 2023, 22:12
I am getting a similar issue on a SLURM HPC using OF-5.x and am looking for help
  #8
New Member
 
Jonathan
Join Date: Sep 2022
Posts: 6
Hello,

I am getting a similar error on my school's HPC cluster, which uses SLURM instead of PBS.

The error is shown below...


No components were able to be opened in the pml framework.

This typically means that either no components of this type were
installed, or none of the installed components can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.

Host: cn484
Framework: pml
--------------------------------------------------------------------------
[cn484:2795285] PML ucx cannot be selected
[cn484:2795291] PML ucx cannot be selected
[cn484:2795275] 1 more process has sent help message help-mca-base.txt / find-available:none found
[cn484:2795275] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages








My batch script that I am attempting to submit is the following...


#!/bin/bash
#SBATCH --account=xiz14026
#SBATCH -J re6000 #job name
#SBATCH -o BFS_6000.o%j #output and error file name (%j expands to jobID)
#SBATCH -e BFS_erro.%j
#SBATCH --partition=priority # allow 12 hours and parallel works
#SBATCH --constraint=epyc128
#SBATCH --ntasks=128
#SBATCH --nodes=1 # Ensure all cores are from whole nodes
#SBATCH --time=12:00:00


module purge

module load slurm
module load gcc/11.3.0
module load zlib/1.2.12
module load ucx/1.13.1
module load openmpi/4.1.4
module load boost/1.77.0
module load cmake/3.23.2
cd /home/jcd17002
source OF5x.env
cd /scratch/xiz14026/jcd17002/BFS_6000
#srun -n 192 --mpi=openmpi pimpleFoam -parallel> my_prog.out
mpirun -np 16 -x UCX_NET_DEVICES=mlx5_0:1 pimpleFoam -parallel > my_prog.out


To explain further: I am trying to run my case on an epyc128 node, using OpenFOAM-5.x and Open MPI 4.1.4 (as recommended by the HPC staff).

All the modules that I thought I should load are in my batch script. If there are any questions, I can clarify further.


The file that I source in the batch script to set the environment variables is the following...


export SYS_MPI_HOME=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/4.1.4
export SYS_MPI_ARCH_PATH=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/4.1.4
export IROOT=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/4.1.4
export MPI_ROOT=$IROOT




export FOAM_INST_DIR=/$HOME/OpenFOAM/
#export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$FORT_COM_LIB64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SYS_MPI_ARCH_PATH/include
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SYS_MPI_ARCH_PATH/lib #/apps2/openmpi/$mpi_version/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$FOAM_INST_DIR/ThirdParty/lib
#export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/apps2/intel/ips/2019u3/impi/2019.3.199/intel64/libfabric/lib
#export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/apps2/intel/ips/2019u3/impi/2019.3.199/intel64/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/apps/cgal/4.0.2/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/apps2/boost/1.77.0/lib
foamDotFile=$FOAM_INST_DIR/OpenFOAM-5.x/etc/bashrc
#[ -f $foamDotFile ] && . $foamDotFile
. $FOAM_INST_DIR/OpenFOAM-5.x/etc/bashrc
echo "Sourcing Bashrc"


#source $FOAM_INST_DIR/OpenFOAM-5.x/etc/config.sh/settings
echo "Done"
export OF_ENVIRONMENT_SET=TRUE
alias cds="cd /scratch/xiz14026/jcd17002/"
unset mpi_version
unset fort_com_version
echo "Done."
#echo ""





This environment file was changed specifically so that the MPI paths from my HPC's openmpi/4.1.4 module agree with the OpenFOAM settings.

I would appreciate any help with this issue. I am not an expert with Open MPI.

Thank you for your time and consideration.
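One basic sanity check for a hand-edited environment file like this, offered only as a sketch and not as a diagnosis of the error above, is to list the LD_LIBRARY_PATH entries that do not exist as directories on the compute node. The helper name `missing_path_entries` is hypothetical.

```shell
#!/bin/bash
# Print every entry of a colon-separated path list ($1) that is not an
# existing directory. Empty entries (from "::" or a leading ":") are skipped.
missing_path_entries() {
    local IFS=':' entry
    for entry in $1; do
        [ -n "$entry" ] || continue
        [ -d "$entry" ] || echo "$entry"
    done
}

# Report stale or mistyped entries in the current environment.
missing_path_entries "$LD_LIBRARY_PATH"
```

Running this inside the batch job, after sourcing the environment file, shows which library directories are unreachable from the node where the solver actually starts.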
