September 16, 2012, 20:59
openFOAM mpirun error on cluster
#1
Member
Charlie
Join Date: Dec 2010
Location: USA
Posts: 85
Rep Power: 16
Hi, Foamers!
I've compiled OpenFOAM on a cluster, and I didn't get any error messages during the compilation. I built everything against the ThirdParty packages, using the gcc and openmpi-1.5.3 that ship with ThirdParty.
When I run a case in serial (just one processor) there is no error and the result looks good. However, when I try to run in parallel with mpirun (roughly the steps sketched right after the log below), I get this error message:
Quote:
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
This will severely limit memory registrations.
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: trestles-login2.sdsc.edu
Local device: mlx4_0
--------------------------------------------------------------------------
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.1.0                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build : 2.1.0-bd7367f93311
Exec : interFoam -parallel
Date : Sep 16 2012
Time : 16:53:01
Host : "trestles-login2.sdsc.edu"
PID : 14007
Case : /xx/OpenFOAM/OpenFOAM-2.1.0/tutorials/multiphase/interFoam/ras/damBreak
nProcs : 4
Slaves :
3
(
"trestles-login2.sdsc.edu.14008"
"trestles-login2.sdsc.edu.14009"
"trestles-login2.sdsc.edu.14010"
)
Pstream initialized with:
floatTransfer : 0
nProcsSimpleSum : 0
commsType : nonBlocking
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time
Create mesh for time = 0
[0]
[0]
[0] --> FOAM FATAL IO ERROR:
[0] error in IOstream "IOstream" for operation operator>>(Istream&, List<T>&) : reading first token
[0]
[0] file: IOstream at line 0.
[0]
[0] From function IOstream::fatalCheck(const char*) const[2]
[3]
[3]
[3] --> FOAM FATAL IO ERROR:
[3] error in IOstream "IOstream" for operation operator>>(Istream&, List<T>&) : reading first token
[3]
[3] file: IOstream[1]
[1]
[1] --> FOAM FATAL IO ERROR:
[1] error in IOstream "IOstream" for operation operator>>(Istream&, List<T>&) : reading first token
[1]
[1] file: IOstream at line 0.
[1]
[1] From function IOstream::fatalCheck(const char*) const
[1] in file db/IOstreams/IOstreams/IOstream.C at line 114.
............
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 14007 on
node trestles-login2.sdsc.edu exiting improperly. There are two reasons this could occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[trestles-login2.sdsc.edu:14006] 3 more processes have sent help message help-mpi-btl-openib.txt / error in device init
[trestles-login2.sdsc.edu:14006] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[trestles-login2.sdsc.edu:14006] 3 more processes have sent help message help-mpi-api.txt / mpi-abort
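For reference, these are roughly the steps I used to launch the run (the standard damBreak tutorial workflow, so the exact commands may differ slightly from what I actually typed; the processor count matches numberOfSubdomains in system/decomposeParDict):
Code:
# from inside the damBreak case directory
blockMesh                 # generate the mesh
setFields                 # initialise the alpha1 field for the dam break
decomposePar              # split the case into processor0..processor3
mpirun -np 4 interFoam -parallel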
My question is: what is causing this problem? Is it the openmpi that I'm using, or the configuration of the cluster? Thanks!
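In case it helps to narrow this down, here is what I was planning to try next. These are ordinary shell and Open MPI commands rather than anything OpenFOAM-specific, so please correct me if they don't apply on this machine:
Code:
# check the locked-memory limit that libibverbs is complaining about
ulimit -l

# try to raise it for this shell (the hard limit is set by the admins,
# e.g. in /etc/security/limits.conf, so this may be refused)
ulimit -l unlimited

# confirm the case really was decomposed into 4 processor directories
ls -d processor*

# rerun while excluding the InfiniBand (openib) transport, so Open MPI
# falls back to shared memory / TCP
mpirun --mca btl ^openib -np 4 interFoam -parallel
If the FOAM FATAL IO ERROR disappears with openib disabled, I would guess the problem is on the cluster/OpenFabrics side; if it stays, it probably points at my case setup or the decomposition instead.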