
Errors with openmpi/4.1.4 on Slurm HPC OF-5.x

February 2, 2023, 22:29 | Post #1
New Member
 
Jonathan
Join Date: Sep 2022
Posts: 6
Hello,

My name is Jonathan, and I am a student working on my master's thesis, which applies CFD to an aerospace problem.



I am getting a similar error on my university HPC, which runs Slurm instead. I am starting my runs on the new epyc128 nodes.

The error is shown below...


No components were able to be opened in the pml framework.

This typically means that either no components of this type were
installed, or none of the installed components can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.

Host: cn484
Framework: pml
--------------------------------------------------------------------------
[cn484:2795285] PML ucx cannot be selected
[cn484:2795291] PML ucx cannot be selected
[cn484:2795275] 1 more process has sent help message help-mca-base.txt / find-available:none found
[cn484:2795275] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages




No matter what I change, I get the message above, with one or more errors at the bottom corresponding to the "PML ucx cannot be selected" lines shown in the last few lines of the output.
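From what I have read, this usually means the Open MPI build cannot load its UCX component on that node. Below is a minimal diagnostic sketch, assuming the openmpi/4.1.4 and ucx/1.13.1 modules are loaded; the ob1/btl line is only a fallback test to rule UCX out, not a fix:

# Does this Open MPI build actually include a UCX PML component?
ompi_info | grep -i ucx

# What transports/devices does UCX itself detect on the node?
ucx_info -d | grep -iE 'transport|device'

# Fallback test: bypass UCX and run over shared memory/TCP with the ob1 PML
mpirun --mca pml ob1 --mca btl self,vader,tcp -np 16 pimpleFoam -parallel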

The HPC center staff think that my OF5.x env file is conflicting with the environment variables loaded by the other modules in my batch script. Unless I am mistaken, though, there is no way to launch an OpenFOAM simulation successfully without setting up the environment. I would appreciate any suggestions or help troubleshooting this, as I cannot find much to go on besides the post in this link.
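In case it helps, here is a small sanity check I could add to the batch script right after sourcing the env file, to see which mpirun and which libraries the job actually picks up (paths are specific to my cluster):

# Which MPI does the job really see after sourcing OF5x.env?
which mpirun
mpirun --version

# Any stale openmpi/ucx entries on the library path?
echo $LD_LIBRARY_PATH | tr ':' '\n' | grep -iE 'openmpi|ucx'

# Which MPI is pimpleFoam actually linked against?
ldd $(which pimpleFoam) | grep -iE 'mpi|ucx'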






The batch script that I am attempting to submit is the following...


#!/bin/bash
#SBATCH --account=xiz14026
#SBATCH -J re6000 #job name
#SBATCH -o BFS_6000.o%j #output and error file name (%j expands to jobID)
#SBATCH -e BFS_erro.%j
#SBATCH --partition=priority # allow 12 hours and parallel works
#SBATCH --constraint=epyc128
#SBATCH --ntasks=128
#SBATCH --nodes=1 # Ensure all cores are from whole nodes
#SBATCH --time=12:00:00


module purge

module load slurm
module load gcc/11.3.0
module load zlib/1.2.12
module load ucx/1.13.1
module load openmpi/4.1.4
module load boost/1.77.0
module load cmake/3.23.2
cd /home/jcd17002
source OF5x.env
cd /scratch/xiz14026/jcd17002/BFS_6000
#srun -n 192 --mpi=openmpi pimpleFoam -parallel > my_prog.out
mpirun -np 16 -x UCX_NET_DEVICES=mlx5_0:1 pimpleFoam -parallel > my_prog.out


To explain further: I am trying to submit my case on an epyc128 node, using OpenFOAM-5.x and OpenMPI 4.1.4 (based on a recommendation from the HPC staff).

All the modules I thought I should use are in the batch script above. If there are any questions, I can clarify further.
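Note that the script requests --ntasks=128 while the mpirun line launches only 16 ranks (I have been varying this while testing). A sketch of keeping them in sync, assuming the case is decomposed into the same number of subdomains (numberOfSubdomains in system/decomposeParDict) as ranks:

# Keep the rank count in sync with the Slurm allocation
# (SLURM_NTASKS is set by Slurm from --ntasks)
mpirun -np $SLURM_NTASKS pimpleFoam -parallel > my_prog.out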


The file that I source to set the environment variables in the batch script is the following...


export SYS_MPI_HOME=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/4.1.4
export SYS_MPI_ARCH_PATH=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/4.1.4
export IROOT=/gpfs/sharedfs1/admin/hpc2.0/apps/openmpi/4.1.4
export MPI_ROOT=$IROOT




export FOAM_INST_DIR=/$HOME/OpenFOAM/
#export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$FORT_COM_LIB64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SYS_MPI_ARCH_PATH/include
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SYS_MPI_ARCH_PATH/lib #/apps2/openmpi/$mpi_version/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$FOAM_INST_DIR/ThirdParty/lib
#export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/apps2/intel/ips/2019u3/impi/2019.3.199/intel64/libfabric/lib
#export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/apps2/intel/ips/2019u3/impi/2019.3.199/intel64/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/apps/cgal/4.0.2/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/apps2/boost/1.77.0/lib
foamDotFile=$FOAM_INST_DIR/OpenFOAM-5.x/etc/bashrc
#[ -f $foamDotFile ] && . $foamDotFile
. $FOAM_INST_DIR/OpenFOAM-5.x/etc/bashrc
echo "Sourcing Bashrc"


#source $FOAM_INST_DIR/OpenFOAM-5.x/etc/config.sh/settings
echo "Done"
export OF_ENVIRONMENT_SET=TRUE
alias cds="cd /scratch/xiz14026/jcd17002/"
unset mpi_version
unset fort_com_version
echo "Done."
#echo ""





This environment file was modified specifically so that the MPI paths used by my HPC's openmpi/4.1.4 module agree with the OpenFOAM settings.
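For completeness, a sketch of how I can verify that agreement after sourcing the bashrc (WM_MPLIB and MPI_ARCH_PATH are standard OpenFOAM environment variables):

# Confirm OpenFOAM resolved the same Open MPI as the module
echo "WM_MPLIB      = $WM_MPLIB"        # expect SYSTEMOPENMPI for a system module
echo "MPI_ARCH_PATH = $MPI_ARCH_PATH"   # expect the module's openmpi/4.1.4 tree
which mpirun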

I am wondering what kind of help anyone can offer regarding this issue; I am no expert with the OpenMPI module.

Thank you for your time and consideration.

Tags
mpi errors, openfoam-5, openmpi 4

