Slow Cases using HPC and PBS job scheduler

Old   January 24, 2015, 21:15
  #1
Senior Member
 
 
Hasan K.J.
Join Date: Dec 2011
Location: Bristol, United Kingdom
Posts: 200
Hey everyone,

I used the HPC at a different university for a year. They used the PBS scheduler and I never had an issue like this.

Now I have moved to another university, and when I try to use the HPC with the PBS job scheduler here I get this weird problem:

- My OpenFOAM case, just a 2D simpleFoam case, runs normally on 64 processors.
- But some runs are suddenly extremely slow.
- To double-check, I submitted the exact same case 5 or 6 times as separate jobs. One or two ran normally and the others ran extremely slowly.

What might be happening here? Any tips or help? I installed OpenFOAM in my own directory; could this be causing the problem?

PS. I am now running the exact same cases on my workstation with no issues, so the problem is not with my case setup.

If you need any further information, please let me know.

Thanks for your time,
Hasan K.J
__________________
"Real knowledge is to know the extent of one's ignorance." - Confucius

Old   January 25, 2015, 11:28
  #2
Retired Super Moderator
 
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Quick answer: Notes about running OpenFOAM in parallel
Quote:
Diagnosing limitations: Parallel Performance of Large Case post #4

Old   January 26, 2015, 10:39
  #3
Hasan K.J.
Hey Bruno,

Thanks for your reply, and sorry if I posted in the wrong place; I thought it was a PBS problem, so I posted it there.

Those links were very helpful. I am still learning a lot of tricks from them, although some posts go over my head.

Coming back to my problem: I have not come across anything similar in those links (where it works sometimes and not at other times).

Let me be clearer this time.

- I have a simple airfoil case that runs very well on my Xenon workstation with no issues.

- The same case runs fine on the HPC cluster too, with no issues.

- But sometimes the same case runs extremely slowly on the cluster. It does not slow down after a couple of hundred time steps: it is either very slow from time step 1 or super fast, as it is supposed to be. (By super slow I mean 140 time steps in 8 hours on 64 processors, roughly 3.4 minutes per step; by super fast I mean about 4 time steps per second.)

- To check whether something was wrong with my case or with the cluster, I submitted the exact same case with no modifications (just copied and renamed) as 5 different jobs. Of the 5 cases, 2 were super fast, which is the normal speed, and the other 3 were extremely slow.

- I asked my HPC administrators about this. They have no answer for me, especially regarding OpenFOAM, and they have not had this issue with any other software.

- I have not installed ParaView on the cluster, just OpenFOAM, in my own personal directory. That is the only difference between the two OpenFOAM installations: ParaView is installed on my Xenon and not on the cluster.

- I have no idea what could be causing this or how to make it stop. The only thing that comes to mind is that I may be getting half of the processors from one node and the other half from another node, but even then it can't be this slow!
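
The node-sharing suspicion in the last point can be checked from inside a job. A minimal sketch, assuming a Torque-style PBS that exports `PBS_NODEFILE` with one hostname per allocated slot (other PBS flavours may lay the file out differently); the helper names `slots_per_node` and `read_nodefile` are made up for illustration:

```python
import os
from collections import Counter


def slots_per_node(nodefile_lines):
    """Count how many slots the scheduler assigned on each host."""
    hosts = [line.strip() for line in nodefile_lines if line.strip()]
    return Counter(hosts)


def read_nodefile(path=None):
    """Read the node file; assumes PBS exports its path as $PBS_NODEFILE."""
    path = path or os.environ.get("PBS_NODEFILE")
    if not path:
        return []  # not running inside a PBS job
    with open(path) as fh:
        return fh.readlines()


if __name__ == "__main__":
    counts = slots_per_node(read_nodefile())
    for host, n in sorted(counts.items()):
        print(f"{host}: {n} slots")
    print(f"{len(counts)} node(s), {sum(counts.values())} slots in total")
```

Running this at the start of a fast job and of a slow job would show whether the 64 slots were spread over the nodes differently in the two cases.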

Thanks for your time,
Hasan K.J

Old   January 26, 2015, 16:55
  #4
Bruno Santos
Hi Hasan K.J,

OK, my guess is that you're tripping over NFS lag. In other words, communication is delayed in some of the runs because the solver processes are stuck waiting for feedback from the disk.
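
One rough way to test this guess: OpenFOAM solvers print a line of the form `ExecutionTime = ... s  ClockTime = ... s` after each time step, where the first value is CPU time and the second is wall-clock time. If the wall-clock time in a slow job is far larger than the CPU time, the process is probably blocked on something such as disk access. The sketch below only parses that line; the function names are hypothetical, and MPI busy-waiting can keep the CPU time high even while ranks wait on each other, so a small gap does not rule out communication stalls.

```python
import re

# Matches e.g. "ExecutionTime = 2.5 s  ClockTime = 10 s"
TIMING = re.compile(
    r"ExecutionTime\s*=\s*([0-9.eE+-]+)\s*s\s+ClockTime\s*=\s*([0-9.eE+-]+)\s*s"
)


def cpu_vs_wall(log_text):
    """Return (execution_time, clock_time) from the last timing line, or None."""
    matches = TIMING.findall(log_text)
    if not matches:
        return None
    cpu, wall = matches[-1]
    return float(cpu), float(wall)


def waiting_fraction(cpu, wall):
    """Fraction of wall-clock time not accounted for by CPU time."""
    if wall <= 0:
        return 0.0
    return max(0.0, 1.0 - cpu / wall)
```

Reading the solver log of a fast job and of a slow job into `cpu_vs_wall` and comparing the two `waiting_fraction` values gives a first hint of whether the slow runs are waiting rather than computing.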

See subsection "3.2.5 Debug messaging and optimisation switches" of the OpenFOAM User Guide: http://www.openfoam.org/docs/user/co...plications.php - and have a look at the "Optimisation switches" topic therein.

Best regards,
Bruno

Old   February 1, 2015, 09:53
  #5
Hasan K.J.
Hi Bruno,

I did look at 3.2.5, but to be honest it went over my head; I did not understand most of it.

I went to the WM_PROJECT_DIR/etc/controlDict file and saw a list of things, and everything had a 0 next to it.

I am seriously lost. I looked at it on a couple of different days and still could not work out what I was supposed to do. Any other guidance? I have to sort out this problem to run LES on my university HPC.

Thanks,
Hasan K.J

Old   February 1, 2015, 12:07
  #6
Bruno Santos
Quick answer:
Quote:
fileModificationSkew
A time in seconds that should be set higher than the maximum delay in NFS updates and clock difference for running OpenFOAM over a NFS.

fileModificationChecking
Method of checking whether files have been modified during a simulation, either reading the timeStamp or using inotify; versions that read only master-node data exist, timeStampMaster, inotifyMaster.
Try the following combinations in the "OptimisationSwitches" section of "$WM_PROJECT_DIR/etc/controlDict":
  • Code:
    fileModificationSkew  0;
    fileModificationChecking  timeStampMaster;
  • Code:
    fileModificationSkew  120;
    fileModificationChecking  timeStampMaster;
  • Code:
    fileModificationSkew  0;
    fileModificationChecking  inotifyMaster;
  • Code:
    fileModificationSkew  120;
    fileModificationChecking  inotifyMaster;
Whichever works, works.

Beyond this, edit the file "system/controlDict" in your case folder and set the following parameter:
Code:
runTimeModifiable false;
Going further, try disabling the writing of time snapshots (set "writeInterval" to a really big number) to diagnose whether the limitation is related to storing data on disk.
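
A complementary check on the disk side, as a rough sketch only: time a handful of small synced writes in the case directory and compare against a node-local directory. The function name `time_small_writes` and its defaults are assumptions for illustration; this does not reproduce OpenFOAM's actual I/O pattern, it only shows whether the file system itself responds slowly.

```python
import os
import tempfile
import time


def time_small_writes(directory, count=20, size=4096):
    """Create, write, fsync and delete `count` small files in `directory`.

    Returns the elapsed seconds for each file (creation and removal included).
    """
    payload = b"x" * size
    timings = []
    for _ in range(count):
        start = time.perf_counter()
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            os.write(fd, payload)
            os.fsync(fd)  # force the data out to the (possibly remote) disk
        finally:
            os.close(fd)
            os.remove(path)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling it once on the case directory (on the shared file system) and once on a local directory such as `/tmp` from within a job, then comparing the mean and the worst value, indicates whether the shared storage is noticeably slower on the nodes where the case crawls.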