April 27, 2015, 14:53 |
Memory leak?
|
#1 |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
Hi all,
I launched a CFD computation on Friday using a mesh with several million nodes, via the parallel_computation.py script. Everything was fine, and I let it run through the weekend. This morning, I found my computer out of memory, with the swap filled to a high percentage. I checked the memory consumption and found that one of the SU2_CFD processes was taking an abnormal amount of memory. I stopped the computation and freed the swap. Then I launched the parallel_computation.py script again, and once more one SU2_CFD process takes more and more memory over time: 4.5 GB after a few hours, versus 2.5 GB for the others. I thought I would find some info about this on this forum. Am I the only one facing something like this? Any help would be much appreciated. Best regards, William |
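[Not part of the original post: for anyone who wants to watch per-rank memory the way William describes, here is a minimal, Linux-only sketch that samples the resident set size of every process named SU2_CFD by reading /proc. The process name SU2_CFD comes from the thread; everything else is an assumption.]

```python
import os


def rss_mb(pid):
    """Resident set size of a process in MB, read from /proc (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024.0  # value is in kB
    return 0.0


def su2_pids():
    """PIDs of all processes whose command name is SU2_CFD."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/comm" % entry) as f:
                if f.read().strip() == "SU2_CFD":
                    pids.append(int(entry))
        except OSError:
            pass  # process exited between listdir and open
    return pids


if __name__ == "__main__":
    # One snapshot; wrap in `watch` or a loop to see which rank grows over time.
    for p in su2_pids():
        print("PID %d: %.0f MB" % (p, rss_mb(p)))
```

A rank whose RSS keeps climbing while its siblings stay flat, as in William's report, stands out after a few samples.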
|
May 5, 2015, 23:02 |
|
#2 |
Member
Li Ji
Join Date: Sep 2014
Posts: 33
Rep Power: 12 |
I have met this problem too, but once the memory approaches about twice the initial amount, it stops increasing.
|
|
May 6, 2015, 05:25 |
|
#3 | |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
Quote:
Yesterday I launched a new parallel computation. This morning, all SU2_CFD processes were taking around 3.5 GB, except one taking more than 10 GB... I don't know whether its memory would have kept increasing, because when I am in front of my computer I want it to go fast, so I stopped SU2 to clear the swap... I plan to use SU2 on another machine. Let's see if this memory problem occurs there as well... Best regards, William |
||
May 6, 2015, 09:28 |
|
#4 | |
Member
Li Ji
Join Date: Sep 2014
Posts: 33
Rep Power: 12 |
Quote:
|
||
May 7, 2015, 04:56 |
|
#5 |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
You may be right,
It seems that the amount of memory stabilizes with time. So it is not a "memory leak", but just a (quite strange) increase in memory usage that levels off over time. Otherwise, SU2 works fine. Best regards, William |
|
May 25, 2015, 05:02 |
|
#6 |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
Bad news for me...
I thought the memory usage had stabilized, but my graphical System Monitor was showing only the memory taken in RAM, not in the swap. In fact the usage keeps increasing until the swap is filled up, and then the run crashes with an error. This explains why each Monday I found my computation stopped with a strange error: Code:
    3638    58.852567   -7.254947   -2.301004   -0.803455    3.387501
    3639    58.850130   -7.255022   -2.301004   -0.803456    3.387498

log10[Maximum residual]: -4.92429.
Max residual point 419743 is located at (-72.6361, 1.18272, 3.39547).

    Iter    Time(s)     Res[Rho]    Res[nu]     CLift(Total) CDrag(Total)
    3640    58.847802   -7.255110   -2.301003   -0.803457    3.387495
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 26329 on node ##### exited on signal 9 (Killed).
--------------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/bin/parallel_computation.py", line 117, in <module>
    main()
  File "/usr/local/bin/parallel_computation.py", line 62, in main
    options.divide_grid )
  File "/usr/local/bin/parallel_computation.py", line 94, in parallel_computation
    info = SU2.run.CFD(config)
  File "/usr/local/bin/SU2/run/interface.py", line 93, in CFD
    run_command( the_Command )
  File "/usr/local/bin/SU2/run/interface.py", line 277, in run_command
    raise Exception , message
Exception: Path = #####/,
Command = mpirun -n 6 /usr/local/bin/SU2_CFD config_CFD.cfg
SU2 process returned error '137'
mpirun: Forwarding signal 20 to job
mpirun: Forwarding signal 18 to job
After testing SU2 on another machine, I will let you know if the problem occurs there too. For info, I compiled SU2 3.2.3 from source on Linux Mint 13. Best regards, William |
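[Aside, not from the thread itself: the '137' in that traceback follows the usual shell convention of 128 + fatal signal number, i.e. signal 9 (SIGKILL), which matches the "exited on signal 9 (Killed)" line from mpirun and is what the Linux OOM killer sends when a node runs out of memory. A tiny decoder:]

```python
import signal


def decode_exit_status(status):
    """Shell convention: an exit status above 128 means 128 + the fatal signal number."""
    if status > 128:
        signum = status - 128
        return "killed by signal %d (%s)" % (signum, signal.Signals(signum).name)
    return "exited normally with code %d" % status


print(decode_exit_status(137))  # killed by signal 9 (SIGKILL)
```

When a rank dies this way, `dmesg` on the affected node usually contains an "Out of memory: Kill process" line confirming that the OOM killer, not SU2 itself, ended the run.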
|
September 15, 2015, 09:29 |
|
#7 |
Member
Antoni Alexander
Join Date: Nov 2009
Posts: 43
Rep Power: 17 |
Hi William, have you solved your problem? I am facing this kind of problem too. After updating from 3.2.3 to 4.0.0, the high memory allocation issue seems to have become even heavier. For example, I launched a RANS simulation with a mesh of several million nodes on 2 nodes / 32 cores; the memory used is several times higher than with other solvers such as CFX, and sometimes it hits the limit.
|
|
September 15, 2015, 09:44 |
|
#8 | |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
Quote:
I am sorry, but I found out that SU2 could not compute what I needed (a boundary condition type that is not available) and used OpenFOAM instead, so I did not investigate this memory problem in SU2 any further. Nonetheless, I am interested in this issue, so don't hesitate to share your own experience. Best regards, William |
||
September 15, 2015, 17:54 |
|
#9 |
Super Moderator
Thomas D. Economon
Join Date: Jan 2013
Location: Stanford, CA
Posts: 271
Rep Power: 14 |
Hi there,
Yes, with the move to fully parallel partitioning (in memory) with ParMETIS, some of the memory requirements did indeed change w/ v4.0. We are always working to bring the memory requirements down, and this will continue to improve. However, did you find that the memory usage stays high beyond the partitioning stage? I would expect that the overall memory footprint should drop quite a bit once the partitioning is complete for the rest of the calculation. Hope this helps, Tom |
|
September 16, 2015, 00:41 |
|
#10 |
Member
Antoni Alexander
Join Date: Nov 2009
Posts: 43
Rep Power: 17 |
Hi Tom, yes, I have found that the memory usage stays high after the partitioning, without an obvious decrease judging from gridview. I will keep tracking the memory usage.
|
|
March 9, 2016, 13:40 |
|
#11 |
New Member
Oliver V
Join Date: Dec 2015
Posts: 17
Rep Power: 11 |
Hello,
I too encountered this problem. I sent the rans/oneraM6 test case to a cluster on 96 procs (overkill, I know) and the results were unexpected:
1. Given the convergence criteria (Cauchy, 1e-6, 100 iterations), the solution never converged and oscillated instead.
2. The program ran for a bit less than 500,000 iterations (while the oscillatory solution appeared well before 70,000 iterations) before the cluster killed it for excessive memory usage on one node. I had 1.7 GB of memory per processor (12 procs per node), which is overkill for a 48,000-cell problem.
I had never seen this behaviour before, but then again, I had never run so many iterations in one go. I would usually cap my iterations at 100,000 and then restart the calculation if it didn't converge. So it seems there is a real memory leak. Any thoughts on this? Oliver. |
|
March 10, 2016, 05:08 |
|
#12 |
Senior Member
|
I recently started with SU2 and also found a large memory footprint. However, I do not think there is a leak, or at least not a large one, since the simulations start off at a high level and only temporarily have a higher footprint when writing restart files. Compared with OpenFOAM, I typically find about 6 times higher memory use.
|
|
March 10, 2016, 12:57 |
|
#13 | |
New Member
Oliver V
Join Date: Dec 2015
Posts: 17
Rep Power: 11 |
Quote:
|
||
March 10, 2016, 13:36 |
|
#14 |
Senior Member
|
SU2 takes about 6 times more memory than OpenFOAM (but OpenFOAM uses face centers for the FVM).
|
|
April 10, 2016, 10:11 |
|
#15 |
New Member
|
There's no shortage of memory leaks in the code (I've just joined the project since I have some time after a job change). I'm working to address them, but need some guidance from the project staff as to language features I'm allowed to use. Stay tuned.
|
|
May 5, 2016, 15:01 |
|
#16 |
New Member
Oliver V
Join Date: Dec 2015
Posts: 17
Rep Power: 11 |
Hello,
Any updates on this subject, HollyGeneralK? Thanks, Oliver |
|
May 5, 2016, 15:17 |
|
#17 |
Super Moderator
Thomas D. Economon
Join Date: Jan 2013
Location: Stanford, CA
Posts: 271
Rep Power: 14 |
Hi all,
I would also be curious to hear folks' experience with the latest versions of the code (try the develop branch on GitHub, or v4.1.2). We've recently improved memory usage and fixed some leaks. At the very least, the memory required by the partitioning stage has been reduced noticeably. Take care, Tom |
|
May 11, 2016, 21:41 |
Help, the same problem.!
|
#18 |
New Member
Andres Niņo
Join Date: May 2016
Posts: 1
Rep Power: 0 |
Hello,
My simulation is suddenly killed without any apparent reason: the process stops at a given iteration without any message. When I run the simulation again, SU2 returns error "137". Has anyone had the same problem? Is it a memory problem (I am running a mesh with 13E6 elements)? Thanks. |
|
Tags |
memory leak, su2 |
|
|