April 27, 2015, 14:53 |
Memory leak?
|
#1 |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
Hi all,
I launched a CFD computation on Friday using a mesh with several million nodes, via the parallel_computation.py script. Everything was fine, and I let it run through the weekend. This morning, I found my computer out of memory, with the swap filled to a high percentage. I checked the memory consumption and found that one of the SU2_CFD processes was taking an abnormal amount of memory. I stopped the computation and freed the swap. Then I launched the parallel_computation.py script again, and once more one SU2_CFD process takes more and more memory over time: 4.5 GB after a few hours, versus 2.5 GB for the others. I thought I would find some info about this on this forum. Am I the only one facing something like this? Any help would be much appreciated. Best regards, William |
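[Not part of the original post: for anyone who wants to watch per-rank memory the way William describes, here is a minimal, Linux-only sketch that samples the resident set size of every process named SU2_CFD by reading /proc. The process name SU2_CFD comes from the thread; everything else is an assumption.]

```python
import os


def rss_mb(pid):
    """Resident set size of a process in MB, read from /proc (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024.0  # value is in kB
    return 0.0


def su2_pids():
    """PIDs of all processes whose command name is SU2_CFD."""
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open("/proc/%s/comm" % entry) as f:
                if f.read().strip() == "SU2_CFD":
                    pids.append(int(entry))
        except OSError:
            pass  # process exited between listdir and open
    return pids


if __name__ == "__main__":
    # One snapshot; wrap in `watch` or a loop to see which rank grows over time.
    for p in su2_pids():
        print("PID %d: %.0f MB" % (p, rss_mb(p)))
```

A rank whose RSS keeps climbing while its siblings stay flat, as in William's report, stands out after a few samples.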
|
May 5, 2015, 23:02 |
|
#2 |
Member
Li Ji
Join Date: Sep 2014
Posts: 33
Rep Power: 12 |
I have met this problem too, but once the memory approaches about twice the initial amount, it stops increasing.
|
|
May 6, 2015, 05:25 |
|
#3 | |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
Quote:
Yesterday I launched a new parallel computation. This morning, all SU2_CFD processes were taking around 3.5 GB, except one taking more than 10 GB... I don't know whether its memory would have kept increasing, because when I am in front of my computer I want it to go fast, so I stopped SU2 to clear the swap... I plan to use SU2 on another machine. Let's see if this memory problem occurs there as well... Best regards, William |
||
May 6, 2015, 09:28 |
|
#4 | |
Member
Li Ji
Join Date: Sep 2014
Posts: 33
Rep Power: 12 |
Quote:
|
||
May 7, 2015, 04:56 |
|
#5 |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
You may be right,
It seems that the amount of memory stabilizes with time. So it is not a "memory leak", but just a (quite strange) increase in memory usage that levels off over time. Otherwise, SU2 works fine. Best regards, William |
|
May 25, 2015, 05:02 |
|
#6 |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
Bad news for me...
I thought the memory usage had stabilized, but my graphical System Monitor was showing only the memory taken in RAM, not in the swap. In fact the usage keeps increasing until the swap is filled up, and then the run crashes with an error. This explains why each Monday I found my computation stopped with a strange error: Code:
    3638    58.852567   -7.254947   -2.301004   -0.803455    3.387501
    3639    58.850130   -7.255022   -2.301004   -0.803456    3.387498

log10[Maximum residual]: -4.92429.
Max residual point 419743 is located at (-72.6361, 1.18272, 3.39547).

    Iter    Time(s)     Res[Rho]    Res[nu]     CLift(Total) CDrag(Total)
    3640    58.847802   -7.255110   -2.301003   -0.803457    3.387495
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 26329 on node ##### exited on signal 9 (Killed).
--------------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/bin/parallel_computation.py", line 117, in <module>
    main()
  File "/usr/local/bin/parallel_computation.py", line 62, in main
    options.divide_grid )
  File "/usr/local/bin/parallel_computation.py", line 94, in parallel_computation
    info = SU2.run.CFD(config)
  File "/usr/local/bin/SU2/run/interface.py", line 93, in CFD
    run_command( the_Command )
  File "/usr/local/bin/SU2/run/interface.py", line 277, in run_command
    raise Exception , message
Exception: Path = #####/,
Command = mpirun -n 6 /usr/local/bin/SU2_CFD config_CFD.cfg
SU2 process returned error '137'
mpirun: Forwarding signal 20 to job
mpirun: Forwarding signal 18 to job
After testing SU2 on another machine, I will let you know if the problem occurs there too. For info, I compiled SU2 3.2.3 from source on Linux Mint 13. Best regards, William |
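[Aside, not from the thread itself: the '137' in that traceback follows the usual shell convention of 128 + fatal signal number, i.e. signal 9 (SIGKILL), which matches the "exited on signal 9 (Killed)" line from mpirun and is what the Linux OOM killer sends when a node runs out of memory. A tiny decoder:]

```python
import signal


def decode_exit_status(status):
    """Shell convention: an exit status above 128 means 128 + the fatal signal number."""
    if status > 128:
        signum = status - 128
        return "killed by signal %d (%s)" % (signum, signal.Signals(signum).name)
    return "exited normally with code %d" % status


print(decode_exit_status(137))  # killed by signal 9 (SIGKILL)
```

When a rank dies this way, `dmesg` on the affected node usually contains an "Out of memory: Kill process" line confirming that the OOM killer, not SU2 itself, ended the run.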
|
September 15, 2015, 09:29 |
|
#7 |
Member
Antoni Alexander
Join Date: Nov 2009
Posts: 43
Rep Power: 17 |
Hi William, have you solved your problem? I am facing this kind of problem too. After updating from 3.2.3 to 4.0.0, the high memory allocation issue seems to have become even heavier. For example, I launched a RANS simulation with a mesh of several million nodes on 2 nodes / 32 cores; the memory used is several times higher than with other solvers such as CFX, and sometimes it hits the limit.
|
|
September 15, 2015, 09:44 |
|
#8 | |
Member
William Tougeron
Join Date: Jan 2011
Location: Czech Republic
Posts: 70
Rep Power: 15 |
Quote:
I am sorry, but I found out that SU2 could not compute what I needed (a boundary condition type that is not available) and used OpenFOAM instead, so I did not investigate this memory problem in SU2 any further. Nonetheless, I am interested in this issue, so don't hesitate to share your own experience. Best regards, William |
||
September 15, 2015, 17:54 |
|
#9 |
Super Moderator
Thomas D. Economon
Join Date: Jan 2013
Location: Stanford, CA
Posts: 271
Rep Power: 14 |
Hi there,
Yes, with the move to fully parallel partitioning (in memory) with ParMETIS, some of the memory requirements did indeed change w/ v4.0. We are always working to bring the memory requirements down, and this will continue to improve. However, did you find that the memory usage stays high beyond the partitioning stage? I would expect that the overall memory footprint should drop quite a bit once the partitioning is complete for the rest of the calculation. Hope this helps, Tom |
|
September 16, 2015, 00:41 |
|
#10 |
Member
Antoni Alexander
Join Date: Nov 2009
Posts: 43
Rep Power: 17 |
Hi Tom, yes, I have found that the memory usage stays high after the partitioning, without an obvious decrease judging from gridview. I will keep tracking the memory usage.
|
|
March 9, 2016, 13:40 |
|
#11 |
New Member
Oliver V
Join Date: Dec 2015
Posts: 17
Rep Power: 11 |
Hello,
I too encountered this problem. I sent the rans/oneraM6 test case to a cluster on 96 procs (overkill, I know) and the results were unexpected:
1. Given the convergence criteria (Cauchy, 1e-6, 100 iterations), the solution never converged and oscillated instead.
2. The program ran for a bit less than 500,000 iterations (while the oscillatory solution appeared well before 70,000 iterations) before the cluster killed it for excessive memory usage on one node. I had 1.7 GB of memory per processor (12 procs per node), which is overkill for a 48,000-cell problem.
I had never seen this behaviour before, but then again, I had never run so many iterations in one go. I would usually cap my iterations at 100,000 and then restart the calculation if it didn't converge. So it seems there is a real memory leak. Any thoughts on this? Oliver. |
|
March 10, 2016, 05:08 |
|
#12 |
Senior Member
|
I recently started with SU2 and also found a large memory footprint. However, I do not think there is a leak, or at least not a large one, since the simulations start off at a high level and only temporarily have a higher footprint when writing restart files. Compared with OpenFOAM, I typically find about 6 times higher memory use.
|
|
March 10, 2016, 12:57 |
|
#13 | |
New Member
Oliver V
Join Date: Dec 2015
Posts: 17
Rep Power: 11 |
Quote:
|
||
March 10, 2016, 13:36 |
|
#14 |
Senior Member
|
SU2 takes about 6 times more memory than OpenFOAM (but OpenFOAM uses face centers for the FVM).
|
|
April 10, 2016, 10:11 |
|
#15 |
New Member
|
There's no shortage of memory leaks in the code (I've just joined the project since I have some time after a job change). I'm working to address them, but need some guidance from the project staff as to language features I'm allowed to use. Stay tuned.
|
|
May 5, 2016, 15:01 |
|
#16 |
New Member
Oliver V
Join Date: Dec 2015
Posts: 17
Rep Power: 11 |
Hello,
Any updates on this subject, HollyGeneralK? Thanks, Oliver |
|
May 5, 2016, 15:17 |
|
#17 |
Super Moderator
Thomas D. Economon
Join Date: Jan 2013
Location: Stanford, CA
Posts: 271
Rep Power: 14 |
Hi all,
I would also be curious to hear folks' experience with the latest versions of the code (try the develop branch on GitHub, or v4.1.2). We've recently improved memory usage and fixed some leaks. At the very least, the memory required by the partitioning stage has been reduced noticeably. Take care, Tom |
|
May 11, 2016, 21:41 |
Help, the same problem.!
|
#18 |
New Member
Andres Niņo
Join Date: May 2016
Posts: 1
Rep Power: 0 |
Hello,
My simulation is suddenly killed without any apparent reason: the process stops at a given iteration without any message. When I run the simulation again, SU2 returns error "137". Has anyone had the same problem? Is it a memory problem (I am running a mesh with 13E6 elements)? Thanks. |
|
Tags |
memory leak, su2 |
|
|