Home > Forums > Software User Forums > OpenFOAM > OpenFOAM Running, Solving & CFD

Issues Running in Parallel on SLURM HPC


November 30, 2022, 18:12  #1
Issues Running in Parallel on SLURM HPC
New Member
 
Jonathan
Join Date: Sep 2022
Posts: 6
Hello,

I am having an issue running a parallel simulation with OpenFOAM-5.x. My PI believes it has something to do with the configuration/location of the mpi and dummy folders in my compilation.

I am trying to use the "srun" command to run pimpleFoam in parallel on 8 nodes with 192 processors, submitting the job with sbatch on a SLURM-based HPC.

My sbatch job script for a simplified case (2 nodes, 2 CPUs each) is shown directly below:

#!/bin/bash
#SBATCH -J re3000_classicSmag # job name
#SBATCH -o BFS_3000.o%j # output file name (%j expands to jobID)
#SBATCH -e BFS_erro.%j # error file name
#SBATCH --exclusive # exclusive mode
#SBATCH --partition=general # 12-hour limit, parallel jobs allowed
#SBATCH --ntasks=4 # total number of MPI tasks
#SBATCH --ntasks-per-node=2
#SBATCH --nodes=2 # ensure all cores are from whole nodes
#SBATCH --time=00:10:00 # run time (hh:mm:ss)
#SBATCH --mail-type=end,begin # events that trigger email notification
#SBATCH --mail-user=jonathan.denman@uconn.edu # destination email address


cd
source OF5x.env
cd /home/jcd17002/OpenFOAM/jcd17002-5.x/run/channel395
#cd /scratch/xiz14026/jcd17002/3000Classic
#srun -n 192 --mpi=openmpi pimpleFoam -parallel> my_prog.out
srun \
--nodes=2 \
--ntasks-per-node=2 \
--mpi=openmpi \
pimpleFoam -parallel > my_prog.out
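As a sketch of an alternative I am considering (assumptions: the site's SLURM was built with PMI2 support and the Open MPI that OpenFOAM links against was configured --with-pmi; `srun --mpi=list` shows which plugins are actually available on a given cluster):

```shell
#!/bin/bash
# Sketch only, not a known-working script for this cluster.
# Check available launch plugins first with: srun --mpi=list
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:10:00

source ~/OF5x.env   # same environment file as above, assumed to live in $HOME

cd /home/jcd17002/OpenFOAM/jcd17002-5.x/run/channel395

# decomposePar must already have been run with numberOfSubdomains in
# system/decomposeParDict equal to the total task count (here 4).
srun --mpi=pmi2 pimpleFoam -parallel > my_prog.out 2>&1
```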


I have tried both of the srun commands shown above; both result in the following error message:

--> FOAM FATAL ERROR:
bool IPstream::init(int& argc, char**& argv) : attempt to run parallel on 1 processor

From function static bool Foam::UPstream::init(int &, char **&)
in file UPstream.C at line 91.

FOAM aborting

#0 Foam::error::printStack(Foam::Ostream&) in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/lib/libOpenFOAM.so"
#1 Foam::error::abort() in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/lib/libOpenFOAM.so"
#2 Foam::UPstream::init(int&, char**&) in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/lib/intel64/libPstream.so"
#3 Foam::argList::argList(int&, char**&, bool, bool, bool) in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/lib/libOpenFOAM.so"
#4 ?

--> FOAM FATAL ERROR:
bool IPstream::init(int& argc, char**& argv) : attempt to run parallel on 1 processor

From function static bool Foam::UPstream::init(int &, char **&)
in file UPstream.C at line 91.

FOAM aborting

#0 Foam::error::printStack(Foam::Ostream&) in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/lib/libOpenFOAM.so"
#1 Foam::error::abort() in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/lib/libOpenFOAM.so"
#2 Foam::UPstream::init(int&, char**&) in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/lib/intel64/libPstream.so"
#3 Foam::argList::argList(int&, char**&, bool, bool, bool) in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/lib/libOpenFOAM.so"
#4 ? in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/bin/pimpleFoam"
#5 __libc_start_main in "/lib64/libc.so.6"
#6 ? in "/home/jcd17002/OpenFOAM/OpenFOAM-5.x/platforms/linux64IccDPInt32Opt/bin/pimpleFoam"
srun: error: cn135: task 0: Aborted (core dumped)
srun: Terminating job step 15817658.0
slurmstepd: *** STEP 15817658.0 ON cn135 CANCELLED AT 2022-11-30T15:28:27 ***
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: cn136: tasks 2-3: Killed
srun: error: cn135: task 1: Aborted (core dumped)
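From the trace, UPstream::init aborts because MPI_Init reported a communicator of size 1, i.e. each task believes it is the whole job. A quick diagnostic I have seen suggested (a sketch; it must run inside a SLURM allocation, and which rank variable is set depends on the MPI flavour and PMI interface) is to check whether srun hands rank information to the tasks at all:

```shell
# Diagnostic sketch, to be run inside an sbatch allocation.
# Each task should print a distinct rank; if every line shows rank 0
# (or the variables are empty), srun and the MPI library are not
# communicating through a common PMI interface.
srun -n 4 sh -c 'echo "host=$(hostname) PMI_RANK=$PMI_RANK OMPI_COMM_WORLD_RANK=$OMPI_COMM_WORLD_RANK"'
```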





I can use the mpirun command without error messages, but it is so slow that it seems to be running in serial rather than in parallel.

I was also told that srun would run much faster and make better use of this HPC architecture than mpirun.
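For comparison, here is a minimal mpirun-based job script of the kind I have been trying (a sketch under the assumption that the case was decomposed into as many subdomains as MPI tasks; paths and the environment file are as in the script above):

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:10:00

source ~/OF5x.env   # assumed to live in $HOME, as above
cd /home/jcd17002/OpenFOAM/jcd17002-5.x/run/channel395

# If mpirun silently runs N serial copies instead of one parallel job,
# a common cause is the wrong mpirun on PATH (not the one OpenFOAM was
# compiled against): `which mpirun` should point into the same MPI tree
# as the libPstream.so build.
decomposePar -force   # needs numberOfSubdomains 4 in system/decomposeParDict
mpirun -np $SLURM_NTASKS pimpleFoam -parallel > my_prog.out 2>&1
```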

Any help is greatly appreciated.



