January 2, 2005, 17:16 |
|
#1 |
Guest
Posts: n/a
|
Dear friends,
I ran a case with around 300,000 cells on 32 processors without a problem. When I increased the grid to around 1,200,000 cells and used 64 processors, it didn't start and gave the following error. I don't think this number of cells is too much for 64 processors. Has anybody experienced a similar error? I would appreciate it if you let me know what's wrong. The error message:
-------------------------------------
new cannot satisfy memory request.
This does not necessarily mean you have run out of virtual memory.
It could be due to a stack violation caused by e.g. bad use of pointers or an out of date shared library
|
January 2, 2005, 20:06 |
|
#2 |
Guest
Posts: n/a
|
It sounds like the case is running on one processor, perhaps as 64 copies of the whole case, each on its own processor. FOAM uses between 1 and 2 kB per cell depending on the code, so 1.2e6 cells sounds like it would fill 32-bit addressing for some of the codes.
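Henry's figures can be checked with back-of-the-envelope arithmetic. A minimal sketch (the 1-2 kB/cell range is his rule of thumb from above; the helper function name is ours):

```python
# Rough per-process memory estimate for an undecomposed FOAM case.
# The 1-2 kB/cell figure is a rule of thumb; actual usage depends on
# the solver and the fields it stores.
def case_memory_gb(n_cells, kb_per_cell):
    """Cells x kB/cell, converted to GB."""
    return n_cells * kb_per_cell * 1024 / 1e9

low = case_memory_gb(1_200_000, 1.0)   # optimistic end of the range
high = case_memory_gb(1_200_000, 2.0)  # pessimistic end of the range

# A 32-bit process can address at most 4 GB (often only 2-3 GB usable),
# so 64 undecomposed copies at this size can plausibly fail to allocate.
print(f"{low:.2f} - {high:.2f} GB per undecomposed copy")
```

At 1.2 million cells the estimate lands in the 1.2-2.5 GB range per copy, which is consistent with the allocation failure on nodes with "over 2 GB of memory each".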
|
|
January 2, 2005, 20:40 |
|
#3 |
Guest
Posts: n/a
|
Thanks a lot Henry,
You are right. I had got such an error when I wanted to run it on one machine, but this is a lot more processors. I'm using PBS to submit jobs on 64 processors out of a larger cluster of IBM dual 3.0 GHz BladeCenter nodes with over 2 GB of memory each.

Actually, for the smaller job (300,000 cells), the 32 parallel processors were only a little faster than a single machine (with a little more memory and approximately the same CPU speed). Is there any special partitioning method or other way of improving the parallel efficiency for irregular geometries? (At the moment I'm using the simple decomposition method.)

The thing is that I got this message twice when I submitted this job, and on the third try it worked: surprisingly, it started to run on 64 processors after I had decreased the number of subdomains from 64 to 32 in decomposeParDict and decompositionDict. It seems that whatever the number of subdomains in these two dicts, it runs on 64 processors and gives no error. Is this what usually happens, or should it give an error or message when the number of subdomains is not the same as the number of processors requested? Regards,
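For reference, a minimal decomposeParDict using the simple method mentioned above might look like this (the coefficients shown are illustrative, not taken from the original case; `n` must multiply to `numberOfSubdomains`):

```
numberOfSubdomains  64;

method              simple;

simpleCoeffs
{
    n               (4 4 4);    // subdomains in x, y, z; 4*4*4 = 64
    delta           0.001;      // cell-skew tolerance
}
```

If the number of subdomains here does not match the `-np` given to mpirun, the run cannot be distributed as intended, which is consistent with the behaviour described above.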
|
January 3, 2005, 07:40 |
|
#4 |
Guest
Posts: n/a
|
I am surprised the speed-up was so small; we get much more than this. What is the inter-connect speed of your machine?
There are three decomposition techniques supplied with FOAM. Try the other two and look at the decomposition statistics decomposePar prints, which will give you an idea of how effective the approach is for your case.

You might also find it useful to play with

scheduledTransfer 1;
floatTransfer 0;
nProcsSimpleSum 16;

in .OpenFOAM-1.0/controlDict, in particular floatTransfer, which can be set to 1 to make the parallel transfer of data use floats rather than doubles; you could also change scheduledTransfer and/or nProcsSimpleSum.

I don't understand why you have two decomposition dictionaries: you should have only one, and of course the information in it should correspond to the decomposition you are using!
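The settings quoted above live in the user's global controlDict. A sketch of what that fragment could look like, with the effect of each switch annotated (placement and exact names may vary between FOAM versions, so treat this as an assumption to check against your own installation):

```
// ~/.OpenFOAM-1.0/controlDict (global settings, per the post above)

scheduledTransfer   1;      // use scheduled (ordered) rather than buffered comms
floatTransfer       1;      // transfer floats instead of doubles across processors
nProcsSimpleSum     16;     // processors per group in global reductions
```

floatTransfer halves the volume of inter-processor data at the cost of reduced precision in the transferred values, which mainly helps on slow interconnects.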
|
June 9, 2005, 17:27 |
|
#5 |
Member
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Rep Power: 17 |
I got the same message as Ali when trying to decompose the case (decomposePar). Should I run it with mpirun? (I will try as soon as the parallel machine where I am running comes back to life...)
Processor 2
    Number of cells = 1042872
    Number of faces shared with processor 1 = 260718
    Number of faces shared with processor 3 = 260718
    Number of boundary faces = 9928
Processor 3
    Number of cells = 1042872
    Number of faces shared with processor 2 = 260718
    Number of faces shared with processor 0 = 260718
    Number of boundary faces = 9928
new cannot satisfy memory request.
This does not necessarily mean you have run out of virtual memory.
It could be due to a stack violation caused by e.g. bad use of pointers or an out of date shared library
Aborted
[luizebs@green01 oodles]$
|
June 10, 2005, 06:22 |
|
#6 |
Senior Member
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,419
Rep Power: 26 |
decomposePar has to hold the undecomposed case and all the pieces it decomposes into. So it uses on average twice the storage the single mesh uses.
Maybe you just run out of memory? What does 'top' show when you run decomposePar?
|
June 10, 2005, 14:48 |
|
#7 |
Member
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Rep Power: 17 |
Yeah. I did run out of memory.
Then I tried to run it with lamexec (I am not sure I can; I am very inexperienced in parallel computations...):

lamexec -np 4 decomposePar . GL3 </dev/null>& logd &

Is this the right command? (I tried with mpirun first, but it looks like decomposePar is not an MPI application, is it?)

Note there are 4 "Processor 3" entries in the output; I just printed the last 2. Thanks for your help, luiz

Processor 3
    Number of cells = 1042872
    Number of faces shared with processor 2 = 260718
    Number of faces shared with processor 0 = 260718
    Number of boundary faces = 9928
Processor 3
    Number of cells = 1042872
    Number of faces shared with processor 2 = 260718
    Number of faces shared with processor 0 = 260718
    Number of boundary faces = 9928
new cannot satisfy memory request.
This does not necessarily mean you have run out of virtual memory.
It could be due to a stack violation caused by e.g. bad use of pointers or an out of date shared library
1765 (n1) exited due to signal 6
[luizebs@green01 oodles]$
|
June 10, 2005, 15:37 |
|
#8 |
Senior Member
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,419
Rep Power: 26 |
You cannot run decomposePar in parallel.
|
|
June 10, 2005, 15:54 |
|
#9 |
Member
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Rep Power: 17 |
But then, how can I run in parallel a mesh that is too big to fit in the memory of a single node?
Does that mean that the mesh size I can run in parallel is limited by the memory of a single node? Thanks a lot, luiz
|
June 10, 2005, 16:09 |
|
#10 |
Senior Member
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,419
Rep Power: 26 |
What normally is being done:
- have a computer with a lot of memory to do the decomposition on.
- run on smaller nodes.

Even better:
- do your mesh generation in parallel (and no, blockMesh does not run in parallel)
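The first workflow Mattijs describes can be sketched as shell commands. This is a hedged sketch: the paths, host name and solver invocation (the FOAM 1.x `<root> <case>` argument style seen elsewhere in this thread) are hypothetical placeholders, not taken from the original posts:

```
# On the big-memory machine: decompose the full case (hypothetical case name).
cd ~/run
decomposePar . bigCase

# Copy only the per-processor pieces to the cluster.
scp -r bigCase/processor* cluster:~/run/bigCase/

# On the cluster: run the solver in parallel on the smaller nodes.
mpirun -np 64 oodles . bigCase -parallel
```

The point is that only decomposePar needs to see the whole mesh at once; the solver processes each hold one processor* piece.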
|
June 10, 2005, 17:02 |
|
#11 |
Member
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Rep Power: 17 |
thanks, Mattijs
Since this is the computer with the most memory I have, I have no option but #2. But I have no Gambit or other mesh generator on this parallel machine, which means I would have to generate a Gambit mesh on another (single-node) computer and convert it using a Foam utility (gambitToFoam). Question: does gambitToFoam run in parallel, or does it have the same limitation as blockMesh?

If I cannot find a way to use Gambit on a parallel machine, I will probably have to use decomposePar, which will again have memory problems, right?

What about this: I sequentially construct 4 (the number of nodes) smaller (4 times) meshes using blockMesh and manually copy each of the generated polyMesh dirs into processor0-3/constant/polyMesh. Then I change the boundary conditions, trying to mimic a boundary file generated via decomposePar. Do you think it would work? Thanks, Luiz
|
June 10, 2005, 17:07 |
|
#12 |
Senior Member
Hrvoje Jasak
Join Date: Mar 2009
Location: London, England
Posts: 1,907
Rep Power: 33 |
Nope. decomposePar orders the faces on parallel boundaries in a special way (i.e. the ordering is the same on both sides of the parallel interface). The chance of getting this right without using decomposePar is slim AND you need to know exactly what you're doing...
Hmm, Hrv
__________________
Hrvoje Jasak Providing commercial FOAM/OpenFOAM and CFD Consulting: http://wikki.co.uk |
|
June 10, 2005, 19:05 |
|
#13 |
Member
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Rep Power: 17 |
Thanks Hrvoje,
In my case, I have a z-direction homogeneous geometry, and I am planning to partition it in the z-direction as well. Does this improve my chances? Again, this is my only possibility, since I have no way to generate a parallel mesh ready to be used by FOAM (in other words, without first running gambitToFoam or decomposePar).

BTW, which of the Foam utilities can be run in parallel (with either mpirun or lamexec)? gambitToFoam, for instance? renumberMesh? Thanks a lot again, luiz
|
June 10, 2005, 20:44 |
|
#14 |
Senior Member
Hrvoje Jasak
Join Date: Mar 2009
Location: London, England
Posts: 1,907
Rep Power: 33 |
Well, you might have half a chance but...
- you'll have to do a ton of mesh manipulation by hand, because I bet the front and back planes will be numbered differently
- unless you grade the mesh (such that faces have different areas and you get a matching error), you might keep getting a running code with rubbish results
- Mattijs might have written some utilities for re-ordering parallel (cyclic?) faces, which may be re-used. (I'm sure he'll pitch in with some ideas - thanks, Mattijs) :-)

To put it straight, I have personally written the parallel mesh decomposition and reconstruction tools and I wouldn't want to be in your skin... It would be much easier to find a 64-bit machine and do the job there. Alternatively, make thick slices for each CPU, decompose the mesh and then use mesh refinement on each piece separately to get the desired number of layers (or something similar).

BTW, have you considered how you are going to look at the results? paraFoam does not run in parallel either. Maybe some averaging in the homogeneous direction or interpolation to a coarser mesh is in order.

As for utilities, call them with no arguments and (most of them) should tell you. Off the cuff, I would say that mesh manipulation tools won't work in parallel but data post-processing (apart from graphics) will. Good luck, Hrv
__________________
Hrvoje Jasak Providing commercial FOAM/OpenFOAM and CFD Consulting: http://wikki.co.uk |
|
June 10, 2005, 22:55 |
|
#15 |
Member
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Rep Power: 17 |
"BTW, have you considered how you are going to look at the results? paraFoam does not run in parallel either. Maybe some averaging in the homogeneous direction or interpolation to a coarser mesh is in order."
Yes. I've already built a coarser mesh and mapped successfully (from a not-so-refined mesh case). Thanks a lot for your comments... luiz
|
June 13, 2005, 17:47 |
|
#16 |
Member
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Rep Power: 17 |
Ok. I reduced the mesh a little and was able to run decomposePar without problems.
But when I run the case (with mpirun) I get the output below. (It still looks like I have some memory problem (I mean, hopefully not me, but my simulation), doesn't it?)

[0] Case   : GL2meules
[0] Nprocs : 4
[0] Slaves : 3 ( green02.5942 green03.4885 green04.4738 )

Create time
Create mesh, no clear-out for time = 150

MPI_Bsend: unclassified: No buffer space available (rank 2, MPI_COMM_WORLD)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (2, MPI_COMM_WORLD):  - MPI_Bsend()
Rank (2, MPI_COMM_WORLD):  - main()
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit code.
This typically indicates that the process finished in error. If your process
did not finish in error, be sure to include a "return 0" or "exit(0)" in your
C code before exiting the application.

PID 22401 failed on node n0 (192.168.0.1) with exit status 1.
-----------------------------------------------------------------------------
[1]+  Exit 1  mpirun -np 4 glLES . GL2meules -parallel 1>&logm
[luizebs@green01 oodles]$

Thanks a lot, luiz
|
June 13, 2005, 17:54 |
|
#17 |
Senior Member
Join Date: Mar 2009
Posts: 854
Rep Power: 22 |
What happens if you increase MPI_BUFFER_SIZE?
|
|
June 13, 2005, 18:10 |
|
#18 |
Member
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Rep Power: 17 |
The same thing (only rank 2 changed to rank 1).
What should this value be? It was 20000000. Thanks, luiz

green02.6630 green03.5573 green04.5426 )

Create time
Create mesh, no clear-out for time = 150

MPI_Bsend: unclassified: No buffer space available (rank 1, MPI_COMM_WORLD)
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit code.
This typically indicates that the process finished in error. If your process
did not finish in error, be sure to include a "return 0" or "exit(0)" in your
C code before exiting the application.

PID 23152 failed on node n0 (192.168.0.1) with exit status 1.
-----------------------------------------------------------------------------
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD):  - MPI_Bsend()
Rank (1, MPI_COMM_WORLD):  - main()
[luizebs@green01 oodles]$ echo %MPI_BUFFER_SIZE
%MPI_BUFFER_SIZE
[1]+  Exit 1  mpirun -np 4 glLES . GL2meules -parallel </dev/null>&logm
[luizebs@green01 oodles]$
|
June 14, 2005, 06:52 |
|
#19 |
Senior Member
Mattijs Janssens
Join Date: Mar 2009
Posts: 1,419
Rep Power: 26 |
Too hard to calculate.
Just double it (and make sure to 'lamwipe' and 'lamboot' so the new settings are known by lamd) and try again. Keep doing this until you no longer get the message.
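The doubling procedure can be sketched as shell commands against a LAM/MPI setup. The solver and case names are taken from earlier in this thread; the starting value is the 20000000 reported above. A hedged sketch, not a definitive recipe:

```
# Double the LAM/MPI buffered-send pool (was 20000000 in the posts above).
export MPI_BUFFER_SIZE=40000000

# Restart the LAM daemons so the new environment is picked up by lamd.
lamwipe
lamboot

# Retry; if MPI_Bsend still reports "No buffer space available",
# double MPI_BUFFER_SIZE again and repeat.
mpirun -np 4 glLES . GL2meules -parallel </dev/null>&logm
```

Putting the export in ~/.bashrc (as mentioned in the next post) makes the setting visible to the processes LAM starts on every node, not just in the current shell.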
|
June 15, 2005, 17:01 |
|
#20 |
Member
diablo80@web.de
Join Date: Mar 2009
Posts: 93
Rep Power: 17 |
Thanks Mattijs,
It is working now. But what would be the consequences of an unnecessarily high value for this buffer size?

Where can I learn more about all these things (mostly Linux and running Linux in parallel)? I feel so weak. (I only later found out that I should put my "export MPI_BUFFER_SIZE=xxxxx" in my bashrc, but I am not even sure why... I suspect it has to do with exporting to all nodes instead of just the current one...) Could you provide some pointers (Linux and parallel stuff)? Books, online tutorials, etc... I really feel the need to know better what is happening, but I don't know how to start... thanks, luiz
|