[cfMesh] Using cfMesh on HPC in parallel (with MPI) for large meshes - MPI_Bsend error |
December 8, 2024, 06:18
Using cfMesh on HPC in parallel (with MPI) for large meshes - MPI_Bsend error
#1

New Member
Join Date: Feb 2023
Posts: 4
Rep Power: 3
Hello foamers,
I have been using cfMesh successfully for several years now (both on local machines and in HPC environments), creating medium-sized meshes without any problem whatsoever. This time, however, I would like to create a large mesh of approx. 200M cells on an HPC cluster, using resources spread across multiple nodes. As far as I understand, cfMesh by default uses all available CPU resources on a single node through shared-memory parallelization (SMP) via OpenMP, but that is no longer sufficient here, as the meshing procedure for this case is quite slow with that approach. I would therefore like to use MPI parallelization across, say, 5 nodes with 128 cores each. To do so, I did the following:

1) Prepared the case as usual - FMS file preparation and the corresponding meshDict settings.

2) Ran
Code:
preparePar
with the number of subdomains set in decomposeParDict (a sketch of the full dictionary follows this list):
Code:
numberOfSubdomains 5;

3) Prepared the cartesianMesh_SLURM.sh script for the SLURM job scheduler; the script is shown after that sketch.
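For reference, a minimal system/decomposeParDict for this step would look roughly like the sketch below. The FoamFile header and the method entry are my assumptions of the standard boilerplate; as far as I know, preparePar only reads numberOfSubdomains and then creates the processor* directories that the parallel run works in.
Code:
/*--------------------------------*- C++ -*----------------------------------*\
|  system/decomposeParDict - minimal sketch                                   |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    object      decomposeParDict;
}

// Number of processor subdomains for the parallel meshing run
numberOfSubdomains 5;

// Decomposition method - used by decomposePar/redistributePar; kept here for
// completeness (assumption, not required by preparePar itself)
method          scotch;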
cartesianMesh_SLURM.sh:
Code:
#!/bin/bash
#SBATCH --nodes=5              # Total number of nodes requested
#SBATCH --ntasks-per-node=1    # 1 MPI task per node
#SBATCH --ntasks=5
#SBATCH --cpus-per-task=128    # Use all these cpus for intra-node parallelization handled by cfMesh by default
#SBATCH --time=72:00:00
#SBATCH --mem=300000
#SBATCH --exclusive
#SBATCH --contiguous
#SBATCH --error=cartesianMesh.err
#SBATCH --output=cartesianMesh.out

# Load appropriate modules and make OpenFOAM available
module load OpenFOAM/v2206
source $FOAM_BASH

solver=cartesianMesh

# Run cfMesh with hybrid MPI + OpenMP parallelization
mpirun -np 5 cartesianMesh -parallel >> log.cartesianMesh
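For clarity, the mapping I am aiming for is one MPI rank per node, with cfMesh's OpenMP threads filling that node. Spelled out with explicit options it would look roughly like the sketch below; the --map-by/--bind-to flags are Open MPI syntax and are my assumption of how to express this (with --ntasks-per-node=1, SLURM may already place the ranks this way).
Code:
# Sketch only: one MPI rank per node, 128 OpenMP threads per rank (Open MPI syntax)
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # threads for cfMesh's shared-memory layer
export OMP_PROC_BIND=spread                   # spread the threads over the node
export OMP_PLACES=cores

mpirun -np 5 --map-by ppr:1:node --bind-to none \
    cartesianMesh -parallel >> log.cartesianMesh 2>&1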
log.cartesianMesh:
Code:
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2206                                  |
|   \\  /    A nd           | Website:  www.openfoam.com                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : _76d719d1e6-20220624 OPENFOAM=2206 version=v2206
Arch   : "LSB;label=32;scalar=64"
Exec   : cartesianMesh -parallel
Date   : Dec 07 2024
Time   : 19:28:30
Host   : tcn941.local
PID    : 3685925
I/O    : uncollated
Case   : <path_to_rootFolder>
nProcs : 5
Hosts  :
(
    (tcn941.local 1)
    (tcn942.local 1)
    (tcn943.local 1)
    (tcn944.local 1)
    (tcn945.local 1)
)
Pstream initialized with:
    floatTransfer      : 0
    nProcsSimpleSum    : 0
    commsType          : nonBlocking
    polling iterations : 0
trapFpe: Floating point exception trapping enabled (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 5, maxFileModificationPolls 20)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Setting root cube size and refinement parameters
Root box (-46192.4 -42195.6 -49707.5) (56207.6 60204.4 52692.5)
Requested cell size corresponds to octree level 10
Refining boundary
Refining boundary boxes to the given size
Number of leaves per processor 1
Distributing leaves to processors
Finished distributing leaves to processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
Finished distributing load between processors
Distributing load between processors
The run then aborts with the following error:
Code:
[tcn942.local.snellius.surf.nl:3684931] pml_ucx.c:738 Error: bsend: failed to allocate buffer
[tcn942.local.snellius.surf.nl:3684931] pml_ucx.c:882 Error: ucx send failed: No pending message
[tcn942:3684931] *** An error occurred in MPI_Bsend
[tcn942:3684931] *** reported by process [3133276161,1]
[tcn942:3684931] *** on communicator MPI_COMM_WORLD
[tcn942:3684931] *** MPI_ERR_OTHER: known error not in list
[tcn942:3684931] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[tcn942:3684931] ***    and potentially your MPI job)
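From what I could gather, OpenFOAM attaches the buffer used by MPI_Bsend at Pstream initialisation and sizes it from the MPI_BUFFER_SIZE environment variable, so my assumption is that this variable has to be visible to every rank on every node. With Open MPI, forwarding it explicitly would look roughly like the sketch below; the -x flags and the buffer value are my own guesses, not something taken from the cfMesh documentation.
Code:
# Sketch only: enlarge the MPI_Bsend attach buffer and explicitly forward the
# environment variables to the remote ranks with Open MPI's -x flag, instead of
# relying on SLURM/mpirun to propagate the environment.
export MPI_BUFFER_SIZE=200000000       # bytes; value is a guess for a 200M-cell mesh
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

mpirun -np 5 -x MPI_BUFFER_SIZE -x OMP_NUM_THREADS \
    cartesianMesh -parallel >> log.cartesianMesh 2>&1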
Therefore I also tried adding the following piece of code to the SLURM script before executing cartesianMesh, but that was without any success as well.
Code:
# Set OpenMP environment variable
# => number of threads OpenMP will use for shared-memory parallelism on the intra-node level
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK,$SLURM_CPUS_PER_TASK   # 128,128

# Set MPI buffer size
export MPI_BUFFER_SIZE=200000000

I tried searching for a way to solve these MPI_Bsend/UCX issues, but unfortunately I did not manage to work out what to do. Does anyone have an idea or a suggestion as to what I am actually doing wrong, and how one could use this amazing meshing tool for large meshes on clusters, with cfMesh running over MPI and CPU resources distributed across multiple nodes? I would really appreciate any insights. Thank you so much!
Tags |
cartesianmesh, cfmesh, hpc cluster, mpi, mpi errors |