Serial OK, parallel fails: mesh conversion problem
September 27, 2007, 09:28 |
#1 |
Member
Radu Mustata
Join Date: Mar 2009
Location: Zaragoza, Spain
Posts: 99
Rep Power: 17 |
Hi All,
I am having a problem running a case of mine in parallel, whilst the serial version is all fine (so far). The mesh is imported from Fluent (with the new fluent3DMeshToFoam utility) and has an internal wall. As I said, this doesn't seem to bother the serial run much, but after decomposing it the run invariably finishes with an MPI error message like:

    Create mesh for time = 0

    [oct11:07921] *** An error occurred in MPI_Recv
    [oct11:07921] *** on communicator MPI_COMM_WORLD
    [oct11:07921] *** MPI_ERR_TRUNCATE: message truncated
    [oct11:07921] *** MPI_ERRORS_ARE_FATAL (goodbye)
    [1]
    [1]
    [1] --> FOAM FATAL IO ERROR : Expected a ')' or a '}' while reading List, found on line 0 an error
    [1]
    [1] file: IOstream at line 0.
    [1]
    [1] From function Istream::readEndList(const char*)
    [1] in file db/IOstreams/IOstreams/Istream.C at line 159.
    [1] FOAM parallel run exiting
    [1]
    [0] ?? in "/lib/libc.so.6"
    [2] #3 ?? at pml_ob1_recvfrag.c:0
    [2] #4 mca_btl_sm_component_progress in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_btl_sm.so"
    [2] #5 mca_bml_r2_progress in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_bml_r2.so"
    [2] #6 opal_progress in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/libopen-pal.so.0"
    [2] #7 mca_pml_ob1_probe in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_pml_ob1.so"
    [2] #8 MPI_Probe in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/libmpi.so.0"
    [2] #9 Foam::IPstream::IPstream(int, int, Foam::IOstream::streamFormat, Foam::IOstream::versionNumber) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/openmpi-1.2.3/libPstream.so"
    [2] #10 Foam::globalPoints::receivePatchPoints(Foam::HashSet<int, Foam::Hash<int> >&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
    [2] #11 Foam::globalPoints::globalPoints(Foam::polyMesh const&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
    [2] #12 Foam::globalMeshData::updateMesh() in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
    [2] #13 Foam::globalMeshData::globalMeshData(Foam::polyMesh const&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
    [2] #14 Foam::polyMesh::globalData() const in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
    [2] #15 Foam::polyMesh::polyMesh(Foam::IOobject const&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so"
    [2] #16 Foam::fvMesh::fvMesh(Foam::IOobject const&) in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so"
    [2] #17 main in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/applications/bin/linux64GccDPOpt/icoFoam"
    [2] #18 __libc_start_main in "/lib/libc.so.6"
    [2] #19 Foam::regIOobject::readIfModified() in "/home/radu/OpenFOAM/OpenFOAM-1.4.1/applications/bin/linux64GccDPOpt/icoFoam"
    [oct11:07971] *** Process received signal ***
    [oct11:07971] Signal: Segmentation fault (11)
    [oct11:07971] Signal code: (-6)
    [oct11:07971] Failing at address: 0x47300001f23
    [oct11:07971] [ 0] /lib/libc.so.6 [0x2aaaac61c110]
    [oct11:07971] [ 1] /lib/libc.so.6(gsignal+0x3b) [0x2aaaac61c07b]
    [oct11:07971] [ 2] /lib/libc.so.6 [0x2aaaac61c110]
    [oct11:07971] [ 3] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_pml_ob1.so [0x2aaab26b8c17]
    [oct11:07971] [ 4] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0x1db) [0x2aaab2cd07cb]
    [oct11:07971] [ 5] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x2a) [0x2aaab28c426a]
    [oct11:07971] [ 6] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/libopen-pal.so.0(opal_progress+0x4a) [0x2aaaad93495a]
    [oct11:07971] [ 7] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_probe+0x3c5) [0x2aaab26b61a5]
    [oct11:07971] [ 8] /home/radu/OpenFOAM/OpenFOAM-1.4.1/src/openmpi-1.2.3/platforms/linux64GccDPOpt/lib/libmpi.so.0(MPI_Probe+0xf6) [0x2aaaad28fda6]
    [oct11:07971] [ 9] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/openmpi-1.2.3/libPstream.so(_ZN4Foam8IPstreamC1EiiNS_8IOstream12streamFormatENS1_13versionNumberE+0xee) [0x2aaaac82f24e]
    [oct11:07971] [10] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam12globalPoints18receivePatchPointsERNS_7HashSetIiNS_4HashIiEEEE+0x22c) [0x2aaaababc50c]
    [oct11:07971] [11] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam12globalPointsC1ERKNS_8polyMeshE+0x24f) [0x2aaaababccaf]
    [oct11:07971] [12] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam14globalMeshData10updateMeshEv+0x110) [0x2aaaabaae890]
    [oct11:07971] [13] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam14globalMeshDataC1ERKNS_8polyMeshE+0xe4) [0x2aaaabaaff64]
    [oct11:07971] [14] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam8polyMesh10globalDataEv+0x55) [0x2aaaabad07f5]
    [oct11:07971] [15] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZN4Foam8polyMeshC2ERKNS_8IOobjectE+0x1c02) [0x2aaaabad6f12]
    [oct11:07971] [16] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so(_ZN4Foam6fvMeshC1ERKNS_8IOobjectE+0x19) [0x2aaaaae3cae9]
    [oct11:07971] [17] /home/radu/OpenFOAM/OpenFOAM-1.4.1/applications/bin/linux64GccDPOpt/icoFoam [0x412e07]
    [oct11:07971] [18] /lib/libc.so.6(__libc_start_main+0xda) [0x2aaaac6094ca]
    [oct11:07971] [19] /home/radu/OpenFOAM/OpenFOAM-1.4.1/applications/bin/linux64GccDPOpt/icoFoam(_ZN4Foam11regIOobject14readIfModifiedEv+0x1a9) [0x412979]
    [oct11:07971] *** End of error message ***
    mpirun noticed that job rank 0 with PID 7969 on node oct11 exited on signal 15 (Terminated).
    3 additional processes aborted (not shown)

checkMesh does say:

    Mesh stats
        points:           4079936
        edges:            12145862
        faces:            12129187
        internal faces:   11788349
        cells:            3986256
        boundary patches: 4
        point zones:      0
        face zones:       0
        cell zones:       3

    Number of cells of each type:
        hexahedra:     3986256
        prisms:        0
        wedges:        0
        pyramids:      0
        tet wedges:    0
        tetrahedra:    0
        polyhedra:     0

    Checking topology...
        Boundary definition OK.
        Point usage OK.
        Upper triangular ordering OK.
        Topological cell zip-up check OK.
        Face vertices OK.
        Number of identical duplicate faces (baffle faces): 77004
        Face-face connectivity OK.
        Number of regions: 1 (OK).

    Checking patch topology for multiply connected surfaces ...
        Patch             Faces    Points   Surface
        pared             182407   182695   ok (not multiply connected)
        inflow_top_lid    1836     1963     ok (not multiply connected)
        outflow_top_lid   2587     2750     ok (not multiply connected)
        pared_interior    154008   77742    multiply connected surface (shared edge)
        <<Writing 77718 conflicting points to set nonManifoldPoints

    Checking geometry...
        Domain bounding box: (-0.04 -0.04 -1.42109e-17) (0.04 0.04 0.08)
        Boundary openness (-2.89631e-16 -8.26677e-16 -8.0705e-16) OK.
        Max cell openness = 8.55581e-16 OK.
        Max aspect ratio = 323.357 OK.
        Minumum face area = 8.39926e-10. Maximum face area = 8.16213e-06. Face area magnitudes OK.
        Min volume = 6.33172e-14. Max volume = 1.06746e-08. Total volume = 0.000402107. Cell volumes OK.
        Mesh non-orthogonality Max: 32.6604 average: 5.2825
        Non-orthogonality check OK.
        Face pyramids OK.
        Max skewness = 0.594768 OK.
        Min/max edge length = 2.04497e-05 0.00509539 OK.
        All angles in faces OK.
        Face flatness (1 = flat, 0 = butterfly) : average = 1 min = 0.999999
        All face flatness OK.

    Mesh OK.

Is it that multiply connected surface (the internal wall) that is causing the trouble? Or should I look elsewhere?

I ask because I also did the import with the old "fluentMeshToFoam" and used the procedure described by Bernhard for "mesh with internal walls". Having two patches instead of one did get rid of the "multiply connected surface" label, but the parallel run failed again.

Sorry for the long post...
Cheers, Radu
|
September 27, 2007, 10:18 |
#2 |
Senior Member
Francesco Del Citto
Join Date: Mar 2009
Location: Zürich Area, Switzerland
Posts: 237
Rep Power: 18 |
How did you decompose the mesh?
Are you using the same OF version for decomposing and running? Try checkMesh in parallel, but I guess it returns the same error... |
|
September 27, 2007, 11:06 |
#3 |
Member
Radu Mustata
I ran decomposePar with the "simple" method, as the 3D mesh is just a 2D one replicated a certain number of times in the third direction.
And yes, I used the same OF 1.4.1 for decomposing and running, and beforehand did an ./Allwmake in ~/OpenFOAM/OpenFOAM-1.4.1/applications/utilities/parallelProcessing, just in case.
However, in the meantime I wiped the part of the mesh on one side of that internal wall patch, so that it became a normal boundary patch of type wall, and repeated the whole process of importing the mesh and so on... and IT WORKED, both serial and parallel. So my guess is that the multiply connected face, or the two faces of zero "depth" that were created following Bernhard's procedure, make the difference at some stage of the parallel run.
Did someone encounter the same problem, or am I just being sloppy somewhere along the way?
Radu
|
September 27, 2007, 13:42 |
#4 |
Senior Member
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,419
Rep Power: 26 |
Can you check that both sides of all processor patches have the same number of points? This is a requirement for valid meshes.
Just run checkMesh on all the domains and compare the patch statistics for procBoundary_xxToyy vs. procBoundary_yyToxx.
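The comparison described in this post can be automated. Below is a minimal sketch in Python, assuming the checkMesh output of each processor domain has been saved as text and that processor patches appear in the same `name faces points ...` layout as the patch table quoted in post #1. The patch-name pattern and the function names `parse_proc_patches` and `find_mismatches` are illustrative assumptions, not part of OpenFOAM.

```python
import re

# Assumed line shape, modelled on the checkMesh patch table in post #1:
#   <patchName> <faces> <points> <surface status ...>
# The name pattern accepts both procBoundary0to1 and procBoundary_0To1.
PATCH_LINE = re.compile(
    r"^\s*procBoundary_?(\d+)[Tt]o_?(\d+)\s+(\d+)\s+(\d+)\b"
)


def parse_proc_patches(log_text):
    """Return {(owner, neighbour): (faces, points)} for processor patches."""
    patches = {}
    for line in log_text.splitlines():
        match = PATCH_LINE.match(line)
        if match:
            owner, neighbour, faces, points = map(int, match.groups())
            patches[(owner, neighbour)] = (faces, points)
    return patches


def find_mismatches(logs):
    """Compare both sides of every processor patch.

    logs: iterable of checkMesh output strings, one per processor domain.
    Returns a list of ((a, b), counts_a_to_b, counts_b_to_a) for every pair
    whose face/point counts differ or whose other side is missing (None).
    """
    all_patches = {}
    for text in logs:
        all_patches.update(parse_proc_patches(text))

    pairs = sorted({tuple(sorted(key)) for key in all_patches})
    mismatches = []
    for a, b in pairs:
        here = all_patches.get((a, b))
        there = all_patches.get((b, a))
        if here != there:
            mismatches.append(((a, b), here, there))
    return mismatches


if __name__ == "__main__":
    # Made-up sample lines, only to show the call; not real checkMesh output.
    log0 = "procBoundary0to1 120 143 ok (not multiply connected)"
    log1 = "procBoundary1to0 120 140 ok (not multiply connected)"
    for pair, here, there in find_mismatches([log0, log1]):
        print(pair, here, there)
```

An empty result means every pair of processor patches agrees on faces and points. If the real log uses a different layout, only `PATCH_LINE` should need adjusting.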
|
September 28, 2007, 05:14 |
#5 |
Member
Radu Mustata
Hi Mattijs,
I did that check and forgot to mention it in the post. And yes, the stats say that they do have the same number of points and faces on either side of the processor boundaries. Furthermore, the patch with the multiply connected faces lies inside one of the processor domains, some cells away from the processor boundary.
LAST MINUTE: I changed the decomposition method from simple to metis, with the same weight on all processes, and to my surprise it works fine, as in it runs with no MPI failure. So I guess I will stick with this to get some results. In the meantime I will try to understand what went wrong before (truth is, I don't think I will, because I don't see anything wrong, as far as I know).
Cheers anyway, Radu
|
September 28, 2007, 06:42 |
#6 |
Senior Member
Mattijs Janssens
Can you post the case or send it to me? (m.janssens)
|
|
September 28, 2007, 07:43 |
#7 |
Member
Radu Mustata
How can I upload a case? Never done that....mesh file from Gambit is large (~800M)...
|
|
September 28, 2007, 07:52 |
#8 |
Member
Radu Mustata
Try to load the 0 and constant dirs... The system dir should be like the one of e.g. icoFoam/cavity, and the case runs with icoFoam...
|
|
September 28, 2007, 08:02 |
#9 |
Member
Radu Mustata
Well... I now know how to do it, but of course it complains about the size... and it fails...
|
|
September 28, 2007, 10:43 |
#10 |
Senior Member
Mattijs Janssens
There's a 50K limit on this forum.
Have any smaller case that has the problem? Or cut out the bits that give problems perhaps?
- set your startTime to latestTime
- for all domains pick up the cells using any point on the boundary:
    setSet . processorXXX
        faceSet f0 new boundaryToFace
        pointSet p0 new faceToPoint f0 all
        cellSet c0 new pointToCell p0 any
- subset the c0 part of the mesh:
    subsetMesh <root> <case> c0
- pack up the subsetted meshes (there will be new time directories with a polyMesh inside)
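The per-domain steps in this post repeat for every processor directory, so it can help to generate the command lists instead of typing them by hand. Below is a minimal sketch in Python that only builds the command strings quoted in the post; it does not run OpenFOAM. The helper name `subset_recipe`, the `processorN` directory naming, and the use of `.` as the root for subsetMesh are assumptions.

```python
# Commands typed at the setSet prompt, copied from the post.
SETSET_STEPS = [
    "faceSet f0 new boundaryToFace",
    "pointSet p0 new faceToPoint f0 all",
    "cellSet c0 new pointToCell p0 any",
]


def subset_recipe(n_domains, root="."):
    """Return one recipe per processor domain.

    Each recipe holds the setSet launch line, the lines to type at its
    prompt, and the subsetMesh line that extracts cell set c0.
    """
    recipes = []
    for i in range(n_domains):
        case = f"processor{i}"  # assumed directory naming
        recipes.append(
            {
                "setSet": f"setSet {root} {case}",
                "prompt": list(SETSET_STEPS),
                # Assumption: <root> <case> take the same values as for setSet.
                "subsetMesh": f"subsetMesh {root} {case} c0",
            }
        )
    return recipes


if __name__ == "__main__":
    for recipe in subset_recipe(4):
        print(recipe["setSet"])
        for step in recipe["prompt"]:
            print("    " + step)
        print(recipe["subsetMesh"])
        print()
```

Printing the recipe gives a checklist to work through per domain. Remember that startTime has to be set to latestTime first, as the post says.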
|
September 28, 2007, 11:03 |
#11 |
Member
Radu Mustata
Did what you said, but the tgz of one of those polyMesh directories still "weighs" some 9M. I will make a smaller case and check.
Thank you for your effort; I will let you know as soon as I get something. Probably Monday...
Cheers, Radu
|
October 1, 2007, 05:42 |
#12 |
Member
Radu Mustata
Well, well... I made a smaller case, but then everything was fine, so no luck in catching the fault. However, on the big case (the one with the metis decomposition), the run failed at a point where data had to be written out, and gave me errors like:

    [oct11:08555] *** Process received signal ***
    [oct11:08555] Signal: Bus error (7)
    [oct11:08555] Signal code: (2)
    [oct11:08555] Failing at address: 0x2aaaab04be10
    [oct11:08555] [ 0] /lib/libc.so.6 [0x2aaaac98b110]
    [oct11:08555] [ 1] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libfiniteVolume.so(_ZNK4Foam20coupledFvsPatchFieldIdE5writeERNS_7OstreamE+0 ) [0x2aaaab04be10]
    [oct11:08555] [ 2] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam(_ZNK4Foam14GeometricFieldIdNS_13fvsPatchFieldENS_11surfaceMeshEE22GeometricBoundaryField10writeEntryERKNS_4wordERNS_7OstreamE+0x12b) [0x42573b]
    [oct11:08555] [ 3] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam(_ZN4FoamlsIdNS_13fvsPatchFieldENS_11surfaceMeshEEERNS_7OstreamES4_RKNS_14GeometricFieldIT_T0_T1_EE+0x1d4) [0x43a344]
    [oct11:08555] [ 4] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam(_ZNK4Foam14GeometricFieldIdNS_13fvsPatchFieldENS_11surfaceMeshEE9writeDataERNS_7OstreamE+0xf) [0x43a3ef]
    [oct11:08555] [ 5] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam11regIOobject11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x263) [0x2aaaabd69e03]
    [oct11:08555] [ 6] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam14objectRegistry11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x93) [0x2aaaabd6dc63]
    [oct11:08555] [ 7] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam14objectRegistry11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x93) [0x2aaaabd6dc63]
    [oct11:08555] [ 8] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam4Time11writeObjectENS_8IOstream12streamFormatENS1_13versionNumberENS1_15compressionTypeE+0x3ab) [0x2aaaabd7ffdb]
    [oct11:08555] [ 9] /home/radu/OpenFOAM/OpenFOAM-1.4.1/lib/linux64GccDPOpt/libOpenFOAM.so(_ZNK4Foam11regIOobject5writeEv+0x4f) [0x2aaaabd69b7f]
    [oct11:08555] [10] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam [0x416b0e]
    [oct11:08555] [11] /lib/libc.so.6(__libc_start_main+0xda) [0x2aaaac9784ca]
    [oct11:08555] [12] /home/radu/OpenFOAM/radu-1.4.1/applications/bin/linux64GccDPOpt/porosoSteadyFoam(__gxx_personality_v0+0xda) [0x412a4a]

Does this ring a bell for anyone? So... I guess that something's wrong in the cluster setup, right? I have to contact the admin.
Cheers, Radu
|
October 2, 2007, 14:05 |
#13 |
Senior Member
Mattijs Janssens
Something still seems to be wrong with your processor patches ...
Try running with the following environment variables set:
    # Initialise blocks of memory to NaN
    FOAM_SETNAN
    # Abort instead of exit
    FOAM_ABORT
    # Exit if NaN encountered
    FOAM_SIGFPE
and possibly under valgrind.
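One way to apply these variables to a single run without touching your shell startup files is to set them only for the launched process. Below is a minimal sketch in Python, assuming any non-empty value (here "1") switches a variable on; the post only says the variables must be set, so the value is an assumption. The helper names `debug_env` and `run_with_debug` are illustrative.

```python
import os
import subprocess
import sys

# Variable names taken from the post.
DEBUG_VARS = ("FOAM_SETNAN", "FOAM_ABORT", "FOAM_SIGFPE")


def debug_env(base=None, value="1"):
    """Copy an environment and set the three debugging variables in it."""
    env = dict(os.environ if base is None else base)
    for name in DEBUG_VARS:
        env[name] = value  # assumption: any non-empty value enables it
    return env


def run_with_debug(command):
    """Run a command (list of strings) with the debugging variables set."""
    return subprocess.run(command, env=debug_env(), check=False)


if __name__ == "__main__":
    # Stand-in command so the sketch runs without OpenFOAM installed;
    # replace it with your own solver command line.
    probe = "import os; print(os.environ['FOAM_SIGFPE'])"
    run_with_debug([sys.executable, "-c", probe])
```

For a parallel run the variables also have to reach every rank; how they are forwarded depends on the MPI launcher and its configuration.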
|
October 5, 2007, 05:21 |
#14 |
Member
Radu Mustata
Mattijs,
I could not make any progress on the problem so far, because I had so many admin things to do, but I will, and I will let you know. Radu
|
October 26, 2007, 07:39 |
#15 |
Member
Radu Mustata
Hi all,
In the end I gave up searching for the problem in decomposePar and friends. And, surprisingly enough, it now seems to work fine. That is after I removed some exports from my bashrc... probably something had tampered with my OF install. That's that, then.
Cheers, Radu
|
July 31, 2008, 16:04 |
#16 |
Member
David P. Schmidt
Join Date: Mar 2009
Posts: 72
Rep Power: 17 |
I believe that increasing the environment variable MPI_BUFFER_SIZE may help.
|
|
|