September 3, 2007, 18:15
#1
Senior Member
Srinath Madhavan (a.k.a pUl|)
Join Date: Mar 2009
Location: Edmonton, AB, Canada
Posts: 703
Rep Power: 21
Has anyone seen this[1] kind of error message before? I've linked OpenFOAM 1.4.1 with the MPICH-compatibility libraries provided by the HP-MPI suite. I first tried linking directly with hpmpi.so, and Pstream compiled fine. However, after that, whenever I tried compiling any solver, I got the very same error messages that Frank Bos reported a while ago[2]. I believe those errors are due to C++ bindings being enabled when building Pstream, but I don't know how to disable them when using HP-MPI. As a result, I had to switch to the MPICH-compatibility libraries provided by HP-MPI, which allow me to build both Pstream and my solver without problems. I need to use HP-MPI because the cluster is configured for a Voltaire InfiniBand switched-fabric interconnect with Hewlett-Packard's XC software stack.
Now, in MPICH-compatibility mode, ldd `which icoFoam_1` gives:

[madhavan@matrix ~]$ ldd `which icoFoam_1`
        libfiniteVolume.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so (0x0000002a95557000)
        libOpenFOAM.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so (0x0000002a96158000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003b8f600000)
        libstdc++.so.6 => /home/users/madhavan/OpenFOAM/linux64/gcc-4.2.1/lib64/libstdc++.so.6 (0x0000002a96604000)
        libm.so.6 => /lib64/tls/libm.so.6 (0x0000003b8f100000)
        libgcc_s.so.1 => /home/users/madhavan/OpenFOAM/linux64/gcc-4.2.1/lib64/libgcc_s.so.1 (0x0000002a96829000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x0000003b8f300000)
        libPstream.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libPstream.so (0x0000002a96937000)
        libtriSurface.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libtriSurface.so (0x0000002a96a3f000)
        libmeshTools.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libmeshTools.so (0x0000002a96bbf000)
        libz.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libz.so (0x0000002a96e2d000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003b8ef00000)
        libmpich.so => /opt/hpmpi/MPICH1.2/lib/linux_amd64/libmpich.so (0x0000002a96f42000)
        librt.so.1 => /lib64/tls/librt.so.1 (0x0000003b94000000)
        liblagrangian.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/liblagrangian.so (0x0000002a9706e000)
        libhpmpi.so => /opt/hpmpi/MPICH1.2/lib/linux_amd64/libhpmpi.so (0x0000002a97170000)
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000003b8fa00000)

Interestingly, a case of only around 1-2 million cells runs perfectly, but a 6- or 9-million-cell case does not, which suggests this has something to do with 32- vs 64-bit build issues. Nevertheless, the output from ldd (shown above) seems to suggest the contrary.
I would appreciate it if someone could shed some light on this issue. Thanks!

[1] Error message from MPI (ranks [11] and [10] print the identical message as rank [12]):

[12]
[12] --> FOAM FATAL IO ERROR : IOstream::check(const char* operation) : error in IOstream "IOstream" for operation operator>>(Istream&, List<T>&) : reading first token
[12]
[12] file: IOstream at line 0.
[12]
[12]     From function IOstream::fatalCheck(const char* operation) const
[12]     in file db/IOstreams/IOstreams/IOcheck.C at line 73.
[12] FOAM parallel run exiting
[12]
MPI Application rank 12 exited before MPI_Finalize() with status 1
MPI Application rank 11 exited before MPI_Finalize() with status 1

[2] http://www.cfd-online.com/OpenFOAM_D...es/1/2968.html
September 4, 2007, 14:51
#2
Senior Member
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,419
Rep Power: 26
In LAM and OpenMPI I just had a look through mpi.h to see how to avoid including the C++ bindings. Maybe there is a similar switch in HP-MPI.
Alternatively, additionally link in the MPI library that provides the C++ functions.
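For reference, two other implementations expose exactly such a switch: MPICH-family headers skip the C++ bindings when MPICH_SKIP_MPICXX is defined, and Open MPI honours OMPI_SKIP_MPICXX. If HP-MPI offered an equivalent, it could be passed through the wmake rule; the fragment below is only a sketch showing those two known macros (HP-MPI's own mpi.h would have to be checked for anything similar):

```
# wmake rules fragment (sketch) -- the defines below are real MPICH and
# Open MPI guards, shown to illustrate the kind of switch being discussed.
PFLAGS = -DMPICH_SKIP_MPICXX -DOMPI_SKIP_MPICXX
```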
September 4, 2007, 15:35
#3
Senior Member
Srinath Madhavan (a.k.a pUl|)
Join Date: Mar 2009
Location: Edmonton, AB, Canada
Posts: 703
Rep Power: 21
Hi Mattijs,
I looked into mpi.h and found no such convenient switch. However, I located the additional library that provides the C++ functions (-lmpiCC). When I add this to my mplibHPMPI rule, the build proceeds fine except for this error message near the end:

/usr/bin/ld: /opt/hpmpi/lib/linux_amd64/libmpiCC.a(intercepts.o): relocation R_X86_64_32S against `MPI::Comm::key_ref_map' can not be used when making a shared object; recompile with -fPIC
/opt/hpmpi/lib/linux_amd64/libmpiCC.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [/home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libPstream.so] Error 1

How should I proceed? Thanks for your help!
September 4, 2007, 16:31
#4
Senior Member
Srinath Madhavan (a.k.a pUl|)
Join Date: Mar 2009
Location: Edmonton, AB, Canada
Posts: 703
Rep Power: 21
Mattijs, thanks very much for your inspiration, which made me fool around a lot with HP-MPI. I think I have finally solved the problem. Here is the detailed solution, in the hope that it will prove useful to others in a similar predicament.
First, I poked into mpi.h as Mattijs suggested, to find out whether there is an easy switch that can be passed through PFLAGS in ~/OpenFOAM/OpenFOAM-1.4.1/wmake/rules/linux64Gcc/mplibHPMPI (which, by the way, is the file you create when you build Pstream against an MPI implementation already installed on the cluster).

Note: I also added the following lines to ~/OpenFOAM/OpenFOAM-1.4.1/.bashrc:

elif [ .$WM_MPLIB = .HPMPI ]; then
    export HPMPI_ARCH_PATH=/opt/hpmpi
    AddLib $HPMPI_ARCH_PATH/lib/linux_amd64/
    AddPath $HPMPI_ARCH_PATH/bin
    export FOAM_MPI_LIBBIN=$FOAM_LIBBIN/hpmpi

and

export WM_MPLIB=HPMPI

to ~/OpenFOAM/OpenFOAM-1.4.1/.OpenFOAM-1.4.1/bashrc. Finally, update the Allwmake script in ~/OpenFOAM/OpenFOAM-1.4.1/src/Pstream to include "$WM_MPLIB" = "HPMPI".

Currently, for HP-MPI, the mplibHPMPI file reads:

PFLAGS = -DHPMP_BUILD_CXXBINDING
PINC   = -I/opt/hpmpi/include
PLIBS  = -L/opt/hpmpi/lib/linux_amd64 -lhpmpio -lhpmpi -ldl -lmpiCC

As one can see, I added the -DHPMP_BUILD_CXXBINDING switch to PFLAGS, as I found that doing so enables C++ bindings support within HP-MPI. In addition, I added -lmpiCC to link against the library with the C++ MPI bindings. When I tried to build Pstream, it failed with the relocation error mentioned above in this thread. This is caused by mixing static libraries into shared builds. The solution is to find a libmpiCC.so in the HP-MPI installation; I could not find one, so I googled and came up with an alternative proposed by HP[1]. This let me rebuild libmpiCC.a using my current g++ (supplied with OpenFOAM). However, the library was still static, so I googled again for how to create shared libraries and found this link[2].
Now all I had to do was follow the recipe:

g++ -fPIC -c intercepts.cc -I/opt/hpmpi/include -DHPMP_BUILD_CXXBINDING
g++ -fPIC -c mpicxx.cc -I/opt/hpmpi/include -DHPMP_BUILD_CXXBINDING
g++ -shared -Wl,-soname,libmpiCC.so -o libmpiCC.so.1.0.1 intercepts.o mpicxx.o -lc

and finally symlink libmpiCC.so.1.0.1 to ~/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libmpiCC.so.

Now my mplibHPMPI file reads:

PFLAGS = -DHPMP_BUILD_CXXBINDING
PINC   = -I/opt/hpmpi/include
PLIBS  = -L/home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/src/mpiCCsrc -L/opt/hpmpi/lib/linux_amd64 -lhpmpio -lhpmpi -ldl -lmpiCC

And after rebuilding libPstream.so followed by icoFoam_1 (my customized icoFoam solver), ldd `which icoFoam_1` gives:

[madhavan@matrix icoFoam]$ ldd `which icoFoam_1`
        libfiniteVolume.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so (0x0000002a95557000)
        libOpenFOAM.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so (0x0000002a96158000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003b8f600000)
        libstdc++.so.6 => /home/users/madhavan/OpenFOAM/linux64/gcc-4.2.1/lib64/libstdc++.so.6 (0x0000002a96604000)
        libm.so.6 => /lib64/tls/libm.so.6 (0x0000003b8f100000)
        libgcc_s.so.1 => /home/users/madhavan/OpenFOAM/linux64/gcc-4.2.1/lib64/libgcc_s.so.1 (0x0000002a96829000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x0000003b8f300000)
        libPstream.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libPstream.so (0x0000002a96937000)
        libtriSurface.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libtriSurface.so (0x0000002a96a4f000)
        libmeshTools.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libmeshTools.so (0x0000002a96bcf000)
        libz.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libz.so (0x0000002a96e3d000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003b8ef00000)
        libmpio.so.1 => /opt/hpmpi/lib/linux_amd64/libmpio.so.1 (0x0000002a96f52000)
        libmpi.so.1 => /opt/hpmpi/lib/linux_amd64/libmpi.so.1 (0x0000002a9708d000)
        libmpiCC.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/hpmpi/libmpiCC.so (0x0000002a972c8000)
        liblagrangian.so => /home/users/madhavan/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/liblagrangian.so (0x0000002a973e4000)

References:
[1] http://docs.hp.com/en/B6060-96024/ch03s02.html
[2] http://tldp.org/HOWTO/Program-Librar...libraries.html
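As a general sanity check after a relink like this, it can help to filter the ldd output down to the MPI-related entries; a small hypothetical helper (the solver name icoFoam_1 is from this thread, the function name is made up):

```shell
# Print only the MPI/Pstream libraries a binary resolves at load time,
# or a note if there are none.
mpi_deps() {
    ldd "$1" | grep -Ei 'mpi|pstream' || echo "no MPI libraries linked"
}

# Example usage against an arbitrary binary; a relinked solver would be
# checked the same way:  mpi_deps "$(which icoFoam_1)"
mpi_deps /bin/sh
```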
September 4, 2007, 16:38
#5
Senior Member
Srinath Madhavan (a.k.a pUl|)
Join Date: Mar 2009
Location: Edmonton, AB, Canada
Posts: 703
Rep Power: 21
Addendum: One might wish to add the -m64 flag to the g++ command line just to be safe.
September 4, 2007, 19:50
#6
Senior Member
Srinath Madhavan (a.k.a pUl|)
Join Date: Mar 2009
Location: Edmonton, AB, Canada
Posts: 703
Rep Power: 21
Alright, I give up! Even after successfully building the application with HP-MPI support, I get the same error message when running a 6-million-cell case. I'm reverting to OpenMPI 1.2.3 for good. If there is one thing I've learnt through this ordeal, it is that proprietary software is "EVIL" by design.
October 30, 2007, 09:55
#7
Senior Member
Eugene de Villiers
Join Date: Mar 2009
Posts: 725
Rep Power: 21
Follow these instructions to get HP-MPI working:
http://openfoamwiki.net/index.php/HowTo_Pstream
Thanks to Henry and Mattijs for the work-around.
October 30, 2007, 23:19
#8
Senior Member
Srinath Madhavan (a.k.a pUl|)
Join Date: Mar 2009
Location: Edmonton, AB, Canada
Posts: 703
Rep Power: 21
Thanks a lot, Eugene, for the info, and of course to Henry and Mattijs as well. It certainly works, but I will need to check whether I can run cases with 4-6 million cells without issues.
October 31, 2007, 11:05
#9
Senior Member
Eugene de Villiers
Join Date: Mar 2009
Posts: 725
Rep Power: 21
Yes, please let me know. My connection to the machine I was about to do the tests on has gone down so I have no way of confirming that the fix solves the 6M cell problem as well.
November 9, 2007, 23:17
#10
Senior Member
Srinath Madhavan (a.k.a pUl|)
Join Date: Mar 2009
Location: Edmonton, AB, Canada
Posts: 703
Rep Power: 21
Apologies for the late response Eugene. HPMPI works very nicely for large cases as well using the instructions you pointed to earlier. Thanks Henry and Mattijs!
August 5, 2009, 17:07
Still there is a problem
#11
New Member
Alireza Mahdavifar
Join Date: Jul 2009
Location: Kingston, ON, Canada
Posts: 4
Rep Power: 17
As you may know, in OpenFOAM-1.5-dev and OpenFOAM-1.6 the file mplibHPMPI has been added to the wmake/rules/$WM_ARCH directory to support HP-MPI, and it includes the instructions that eugene linked above. I have compiled Pstream using those settings (mplibHPMPI), but I still get the same error that msrinath80 reported, for a mesh of 3 million grid points (or more) on more than 4 CPUs (1 node).
Last edited by ali84; August 6, 2009 at 02:15.