bash script for pseudo-parallel usage of reconstructPar

#21   August 18, 2014, 03:06
hanness (hannes)
Hi,
Could you post some more information about your case? From what you write, I assume that you only have two timesteps to reconstruct: the way the script is written, it will never start more jobs than there are timesteps left.

Hannes

#22   September 11, 2014, 06:40
Error while running parReconstructPar
kcn
Hi,

I get the following error while trying to run parReconstructPar.

Starting Job 1 - reconstructing time = 0.1 through 166.6
Job started with PID 10452
Starting Job 2 - reconstructing time = 166.7 through 234.1
Job started with PID 10462
Starting Job 3 - reconstructing time = 234.2 through 31.7
Job started with PID 10472
Starting Job 4 - reconstructing time = 31.8 through 300
Job started with PID 10482


--> FOAM FATAL ERROR:
No times selected

From function reconstructPar
in file reconstructPar.C at line 210.

FOAM exiting

Note that the time ranges in jobs 3 and 4 are a bit strange, too.

Can someone please tell me how to correct this?

Thanks
kcn

#23   September 11, 2014, 06:54
Linse (Bernhard Linseisen)
Hi kcn!

I can only GUESS, but is it maybe related to the timestep given in your controlDict?
Otherwise: have you already tried whether it works with only two processors? Maybe the division of timesteps produces values that do not work.
Or maybe you could try decomposing timestep 0 as well, so that it starts from timestep 0 instead of 0.1?

As I said, this is all guesswork, but these are the approaches I would take for further testing...

Cheers,
Bernhard

#24   September 11, 2014, 19:27
Minor Fix to get all numeric directories
willzyba (Will)
The errors reported above are due to a bug in the way the script lists the time directories in processor0. They need to be ordered by value using the "-1v" flags of ls; otherwise you can get the times in the wrong order, e.g. ... 0.7, 0.8, 1, 10, 1.1, 1.2 ....

If the number of processors is such that one processor picks up the range from 10 to 1.9 (say), when it should be from 1.1 to 1.9, then you have a problem.

Simply replace all occurrences of "ls processor0 | ...." with "ls processor0 -1v | ...."
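
The ordering problem can be reproduced without OpenFOAM. The following is a minimal sketch, assuming a `sort` that supports `-g` (GNU coreutils does); the sample times and function names are made up for illustration. `LC_ALL=C` pins the locale, because both plain string order and the decimal separator understood by `sort -g` depend on it.

```shell
# Hypothetical sample of time directory names (not from a real case).
sample_times() { printf '%s\n' 0.5 1 1.5 2 10; }

# Plain string order: "10" sorts before "2" because '1' < '2'.
lexical_order() { sample_times | LC_ALL=C sort | paste -sd' ' -; }

# General numeric order: compares the values, not the characters.
numeric_order() { sample_times | LC_ALL=C sort -g | paste -sd' ' -; }

lexical_order   # prints: 0.5 1 1.5 10 2
numeric_order   # prints: 0.5 1 1.5 2 10
```

A job that takes its start and end times from neighbouring entries of the first list can end up with a range such as "10 through 2", which selects no times at all.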

Modified script attached.

parReconstructPar.txt

Otherwise a great script. Thank you.

#25   September 18, 2014, 02:45
kcn
Dear Will,

Thank you very much for the corrected script.

kcn

#26   October 7, 2014, 07:21
OvGU (sd)
Quote:
Originally posted by kwardle
All,
I thought I would post this as maybe someone else will find it useful. I wrote a short script (not necessarily pretty, but it works...) that runs reconstructPar in pseudo-parallel mode by breaking the time directories into a number of ranges and running multiple instances of reconstructPar. For lack of a better name I have called it parReconstructPar.
Enjoy.
-Kent

(The forum won't take a file without an extension so it is uploaded as a .txt -- just save it somewhere in your path as parReconstructPar and make it executable.)


I tried putting your file in my case directory and executing ./parReconstructPar, but it doesn't work. I also put it in my opt/bin directory just in case, but that failed too. Can you tell me where I'm making a mistake?

#27   October 7, 2014, 08:08
jherb (Joachim Herb)
Have you set the executable flag?
Code:
chmod a+x parReconstructPar
In any case you should be able to start it with
Code:
sh parReconstructPar

#28   October 14, 2014, 05:52
sharifi (Jim KIT)
Hello,

thanks for sharing.
I'm running a parallel simulation in OpenFOAM and, because of the limit on the number of files on my computer, I have to keep the amount of data from growing without bound.
I have hardly any experience in bash scripting, and I want to write a script that checks at a regular interval, let's say every minute, whether there is a new time in processor0. It should then reconstruct this new time and delete it from all the processor directories. Obviously, it should be able to run during the simulation.

I would appreciate it if somebody could help me.
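
One way such a watcher could look is sketched below. It is only an outline under several assumptions and is not part of parReconstructPar: the names `RECONSTRUCT`, `log.reconstructWatch` and `stopWatch` are hypothetical, the newest time is skipped on the assumption that the solver may still be writing it, and time 0 is kept. It has to be run from the case directory.

```shell
# Command used for reconstruction; override it to dry-run the loop.
RECONSTRUCT=${RECONSTRUCT:-reconstructPar}

# List the numeric time directories of processor0 in ascending order.
list_proc_times() {
    ls -1 processor0 |
        grep -E '^[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$' |
        LC_ALL=C sort -g
}

# Reconstruct every time except 0 and the newest one, then delete each
# successfully reconstructed time from all processor directories.
reconstruct_and_prune() {
    list_proc_times | sed '$d' | while read -r t; do
        [ "$t" = "0" ] && continue
        if "$RECONSTRUCT" -time "$t" < /dev/null >> log.reconstructWatch 2>&1; then
            rm -rf processor*/"$t"
        fi
    done
}

# Poll every $1 seconds (default 60) until a file named "stopWatch" exists.
watch_case() {
    interval=${1:-60}
    while [ ! -e stopWatch ]; do
        reconstruct_and_prune
        sleep "$interval"
    done
}
```

Calling `watch_case 60 &` in the case directory would start the loop in the background, and `touch stopWatch` would end it after the current pass. A time is only deleted when the reconstruction command reports success, so a failed reconstruction leaves the decomposed data in place.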

thx

#29   December 7, 2014, 11:56
jwstolk (Jaap Stolk)
Dear Will,

I'm very happy with this script, but I would like to point out that there still is a small overlap in the detected time ranges when decimals are involved:

Starting Job 1 - reconstructing time = 0.25 through 30.5
Starting Job 2 - reconstructing time = 30.25 through 60.5

(I can only run 2 reconstructs in parallel on a single machine (16 GB ram limit)
but the script parameters make it very easy to manually divide the reconstructing over 2 or 3 machines.)

edit:
this example is a bit more problematic, I may have to run 30.5 manually:
Starting Job 1 - reconstructing time = 0.25 through 30.25
Starting Job 2 - reconstructing time = 30.75 through 61
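
Overlaps and gaps like these can be avoided by splitting the sorted list by position rather than by computing boundary values. The following is a minimal sketch of that idea, not the method used inside parReconstructPar; `split_time_ranges` is a hypothetical helper, and the `start:end` output is chosen to match the range form that `reconstructPar -time` accepts.

```shell
# Split a list of times (one per line on stdin) into at most $1 contiguous
# chunks and print one "start:end" pair per job.
split_time_ranges() {
    LC_ALL=C sort -g | awk -v njobs="$1" '
        { t[NR] = $1 }
        END {
            if (NR == 0) exit
            if (njobs > NR) njobs = NR          # never more jobs than times
            base = int(NR / njobs); extra = NR % njobs
            first = 1
            for (j = 1; j <= njobs; j++) {
                n = base + (j <= extra ? 1 : 0)  # spread the remainder
                last = first + n - 1
                print t[first] ":" t[last]
                first = last + 1
            }
        }'
}

printf '%s\n' 30 30.25 30.5 30.75 31 | split_time_ranges 2
# prints:
# 30:30.5
# 30.75:31
```

Because every entry of the list belongs to exactly one chunk, no timestep is reconstructed twice and none is skipped, whatever the number of decimals.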

Last edited by jwstolk; December 7, 2014 at 12:45. Reason: (another example)

#30   December 14, 2014, 22:53
allenfieldin (ALLEN)
Hello jwstolk,

I am using the tool on an HPC system. Can you tell me how to run this application? I have put the script in the directory where the processor* directories are stored, but when I run "parReconstructPar 12" it simply gives me a "command not found" error.

Quote:
[atlas7-c01]$ parReconstructPar 12
bash: parReconstructPar: command not found
[atlas7-c01]$ .parReconstructPar
bash: .parReconstructPar: command not found
[atlas7-c01]$ ./parReconstructPar
bash: ./parReconstructPar: Permission denied
[atlas7-c01]$ sh parReconstructPar

K. Wardle 6/22/09, modified by H. Stadler Dec. 2013, minor fix Will Bateman Sep 2014.
bash script to run reconstructPar in pseudo-parallel mode
by breaking time directories into multiple ranges


USAGE: parReconstructPar -n <NP> -f fields -o <OUTPUTFILE>
-f (fields) is optional, fields given in the form T,U,p; option is passed on to reconstructPar
-t (times) is optional, times given in the form tstart,tstop
-o (output) is optional

[atlas7-c01]$ parReconstructPar 12
bash: parReconstructPar: command not found
[atlas7-c01]$
I think something is wrong with the way I am using it. Can you help me out and tell me the exact usage for running parReconstructPar? I am new to both OpenFOAM and HPC.

Any help would be very much appreciated.

/Allen

#31   December 15, 2014, 17:22
jwstolk (Jaap Stolk)
I don't know exactly what HPC is in this case, but I will assume this is some form of Linux.

You are on the right track.
./parReconstructPar should normally work, but since you downloaded the script, it is not marked as executable, which results in the "Permission denied" error. If you run:
ls -l parReconstructPar
the "x" flag should be missing.
In that case, run something like:
chmod a+x parReconstructPar

(Oh, and it is a good idea to read scripts you download from the internet before giving them execute permission :-)
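
To call the script by name from any case directory, it also has to sit in a directory that is listed in PATH. A minimal sketch of that step, assuming the download is called parReconstructPar.txt and that `$HOME/bin` is an acceptable target; `install_script` and `on_path` are hypothetical helper names.

```shell
# Copy a downloaded script into a bin directory with the executable bit set.
# $1: downloaded file, $2: target directory (default: $HOME/bin)
install_script() {
    src=$1
    dest_dir=${2:-$HOME/bin}
    mkdir -p "$dest_dir" &&
        install -m 755 "$src" "$dest_dir/parReconstructPar"
}

# Succeed if directory $1 is listed in PATH.
on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Intended use:
#   install_script parReconstructPar.txt
#   on_path "$HOME/bin" || echo 'add $HOME/bin to PATH in your shell profile'
```

If the target directory is not on PATH, the script still works when started with its full path or, from the directory that contains it, as ./parReconstructPar.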

Normally you only need to use the -n option, like:
./parReconstructPar -n 12
(see the USAGE line in your quote)

Note that reconstructing takes quite a bit of RAM. I recommend running first with 1 or 2 instead of 12, checking how much RAM it ends up using (for example with top or htop), and then deciding how many jobs you can run in parallel without running out of RAM.
The OS can swap other programs to disk, but the RAM used by parReconstructPar is in use continuously, and when even a small part of it needs to be swapped to disk, everything slows down to a crawl.
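
The arithmetic behind that decision can be sketched as follows. `max_jobs` is a hypothetical helper, the 5 GB per instance in the example is a made-up figure, and the default budget assumes a Linux kernel that reports `MemAvailable` in /proc/meminfo.

```shell
# Estimate how many reconstructPar instances fit in memory.
# $1: peak memory of one instance in kB (measured with top or htop)
# $2: memory budget in kB (default: MemAvailable from /proc/meminfo)
max_jobs() {
    per_job_kb=$1
    avail_kb=${2:-$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo)}
    n=$(( avail_kb / per_job_kb ))
    if [ "$n" -lt 1 ]; then n=1; fi   # always allow a single job
    echo "$n"
}

max_jobs 5000000 16000000   # prints 3
```

The result would then be passed to the script, for example as ./parReconstructPar -n "$(max_jobs 5000000)". Leaving some headroom below the computed value is safer, since the measured peak of one small test run may not be the true peak.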

With my current cases, I can run up to "-n 3" with 16 GB of RAM.
Since my files are on an NFS drive, I can use the "-t start,end" option to process only half the time directories, and run the other half from another computer with another 16 GB of RAM.
(The standard reconstructPar tool selects times through its -time option, which takes a list of timestamps or start:end ranges rather than the "start,end" form used by this script.)

If you are using decimals in your saved timestamps, check that the script does not skip or repeat a timestep between the split time ranges, because the sorting of time directory names with decimals is not reliable.

#32   February 16, 2015, 00:35
reconstructPar in parallel using GNU Parallel with a bash one-liner
opedrofunk (Peter)
Hi All,

I didn't know there was a script for this - really nice. I usually just do this with a bash one-liner:

Code:
$ foamListTimes  -processor > log.foamTimes; awk 'NR%4==1' log.foamTimes | parallel --halt=0 -j8 reconstructPar -newTimes -time {}:
Explanation:

Code:
foamListTimes -processor
lists all the times in the processor0/ directory; the redirection saves them to a file called log.foamTimes

Code:
awk 'NR%4==1' log.foamTimes
prints every 4th line, starting with the first (change 4 to whatever number divides the number of times more or less evenly among the processes you want to use), and pipes it to

Code:
parallel --halt=0 -j8 reconstructPar -newTimes -time {}:
which takes the piped input and divides it among -j8 processes (change to whatever you want), each of which runs reconstructPar starting at -time {}: and, thanks to -newTimes, skips any times that have already been reconstructed by another job. This is important because the ":" after the start time value {} makes every range open-ended, so each job would otherwise run on to the last time. The --halt=0 flag tells GNU Parallel to continue if an error happens to occur.
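
To see what the one-liner would hand to GNU Parallel, the selection step can be run on its own. This is a small sketch with a made-up list of ten times standing in for log.foamTimes; it only prints the commands and does not call reconstructPar.

```shell
# Hypothetical list of times standing in for log.foamTimes.
sample_times() { printf '%s\n' 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1; }

# Every 4th line, starting with the first: the start time of each job.
start_times() { sample_times | awk 'NR%4==1'; }

# The command lines that would be run, one per start time.
planned_commands() {
    start_times | while read -r t; do
        echo "reconstructPar -newTimes -time $t:"
    done
}

planned_commands
# prints:
# reconstructPar -newTimes -time 0.1:
# reconstructPar -newTimes -time 0.5:
# reconstructPar -newTimes -time 0.9:
```

With ten times and a step of 4 this gives three jobs, so the step and the -j value are best chosen together.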

Anyway, that's the solution I've been using - hope this helps.
Peter

#33   July 19, 2015, 22:25
meth (methma Rajamuni)
kwardle,

Thank you very much for sharing the parReconstructPar script. It works perfectly.

Best,

Meth.

#34   August 28, 2016, 06:07
Taataa (Taher Chegini)
Thanks Peter. Your one-line code works like a charm, concise and efficient.

#35   May 12, 2017, 10:24
random_ran (Ran)
Thanks, it really helps.

It looks good, but I've noticed something that is possibly a bug.

This is my input and the resulting output:

$ sh reconPar 24 test
running reconstructPar -noZero in pseudo-parallel mode on 24 processors
reconstructing 134 time directories
making temp dir
Starting Job 1 - reconstructing time = 0 through 10.5
Starting Job 2 - reconstructing time = 10.8 through 1.2
Starting Job 3 - reconstructing time = 12.3 through 13.8
Starting Job 4 - reconstructing time = 14.1 through 15.3
Starting Job 5 - reconstructing time = 15.6 through 17.1
Starting Job 6 - reconstructing time = 17.4 through 18.6
Starting Job 7 - reconstructing time = 18.9 through 20.4
Starting Job 8 - reconstructing time = 20.7 through 21.9
Starting Job 9 - reconstructing time = 22.2 through 23.7
Starting Job 10 - reconstructing time = 24 through 25.2
Starting Job 11 - reconstructing time = 25.5 through 27
Starting Job 12 - reconstructing time = 2.7 through 28.5
Starting Job 13 - reconstructing time = 28.8 through 30
Starting Job 14 - reconstructing time = 30.3 through 31.8
Starting Job 15 - reconstructing time = 32.1 through 33.3
Starting Job 16 - reconstructing time = 33.6 through 35.1
Starting Job 17 - reconstructing time = 35.4 through 36.6
Starting Job 18 - reconstructing time = 36.9 through 38.4
Starting Job 19 - reconstructing time = 38.7 through 39.9
Starting Job 20 - reconstructing time = 4.2 through 5.7
Starting Job 21 - reconstructing time = 6 through 7.5
Starting Job 22 - reconstructing time = 7.8 through 9.3
Starting Job 23 - reconstructing time = 9.6 through
Starting Job 24 - reconstructing time = through 39.9


===============================================

What does it mean for Job 2 to run from 10.8 through 1.2? Is this a bug?
Starting Job 2 - reconstructing time = 10.8 through 1.2

Also notice that
Starting Job 23 - reconstructing time = 9.6 through
Starting Job 24 - reconstructing time = through 39.9

I cannot understand what those two lines mean. Do they only deal with the timestamps 9.6 and 39.9?



After a while, the program appeared to be stuck at some timestamp. The output up to that point was identical to the full record further down: the same job list, the same "No times selected" error, and a countdown that had reached "6 directories remaining...".

I am using O.F. v4.1 with the following hardware:

24 cores/node, Memory per node 32 G, infiniband, AMD @2.1GHz CPU, Centos 6.8.



Third update: After running for 01:43:43, it finished without error. Thanks man!

But from what I observed, the last few directories were much slower than the others. Does anybody have an idea why?

This is the full record of the output.

$ sh reconPar 24 test
running reconstructPar -noZero in pseudo-parallel mode on 24 processors
reconstructing 134 time directories
making temp dir
Starting Job 1 - reconstructing time = 0 through 10.5
Starting Job 2 - reconstructing time = 10.8 through 1.2
Starting Job 3 - reconstructing time = 12.3 through 13.8
Starting Job 4 - reconstructing time = 14.1 through 15.3
Starting Job 5 - reconstructing time = 15.6 through 17.1
Starting Job 6 - reconstructing time = 17.4 through 18.6
Starting Job 7 - reconstructing time = 18.9 through 20.4
Starting Job 8 - reconstructing time = 20.7 through 21.9
Starting Job 9 - reconstructing time = 22.2 through 23.7
Starting Job 10 - reconstructing time = 24 through 25.2
Starting Job 11 - reconstructing time = 25.5 through 27
Starting Job 12 - reconstructing time = 2.7 through 28.5
Starting Job 13 - reconstructing time = 28.8 through 30
Starting Job 14 - reconstructing time = 30.3 through 31.8
Starting Job 15 - reconstructing time = 32.1 through 33.3
Starting Job 16 - reconstructing time = 33.6 through 35.1
Starting Job 17 - reconstructing time = 35.4 through 36.6
Starting Job 18 - reconstructing time = 36.9 through 38.4
Starting Job 19 - reconstructing time = 38.7 through 39.9
Starting Job 20 - reconstructing time = 4.2 through 5.7
Starting Job 21 - reconstructing time = 6 through 7.5
Starting Job 22 - reconstructing time = 7.8 through 9.3
Starting Job 23 - reconstructing time = 9.6 through
Starting Job 24 - reconstructing time = through 39.9


--> FOAM FATAL ERROR:
No times selected

From function int main(int, char**)
in file reconstructPar.C at line 225.

FOAM exiting

134 directories remaining...
112 directories remaining...
110 directories remaining...
108 directories remaining...
105 directories remaining...
100 directories remaining...
89 directories remaining...
88 directories remaining...
87 directories remaining...
85 directories remaining...
83 directories remaining...
79 directories remaining...
77 directories remaining...
73 directories remaining...
63 directories remaining...
61 directories remaining...
58 directories remaining...
56 directories remaining...
54 directories remaining...
52 directories remaining...
49 directories remaining...
39 directories remaining...
37 directories remaining...
35 directories remaining...
33 directories remaining...
32 directories remaining...
28 directories remaining...
24 directories remaining...
20 directories remaining...
19 directories remaining...
17 directories remaining...
15 directories remaining...
11 directories remaining...
10 directories remaining...
9 directories remaining...
8 directories remaining...
7 directories remaining...
6 directories remaining...
5 directories remaining...
4 directories remaining...
3 directories remaining...
2 directories remaining...
1 directories remaining...
cleaning up temp files
finished

E.O.F

Last edited by random_ran; May 12, 2017 at 11:58. Reason: Update output:3rd

#36   May 30, 2017, 15:09
random_ran (Ran)
I found the citation notice from the creator of GNU Parallel [O. Tange (2011)] funny:

To silence the citation notice: run 'parallel --bibtex'.

That's a good way to remind users. Anyway, thanks.


#37   June 8, 2018, 20:09
Another minor fix to get all time steps in order
gsalvador (Guilherme Salvador Vieira)
Dear all,

I just wanted to upload a small fix that handles situations in which Will's last version still struggles (e.g. if you have outputs at times 0.125, 0.25, 0.375 and 0.5, for which the proposed "ls -1v" by itself does not produce the correct order).

The idea is simply to replace the occurrences of "ls processor0 -1v | ..." with "ls processor0 -1v | sort -g | ...", which guarantees the ordering is correct regardless of how many decimal digits are used in different folders.
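
The difference shows up in a scratch directory. The sketch below assumes GNU `ls` and `sort`; `list_times` is a hypothetical helper that additionally filters out non-numeric entries such as "constant", which goes beyond the one-line change described above.

```shell
# List the time directories under $1 (default: processor0) in numeric order.
# Non-numeric entries such as "constant" are filtered out first.
list_times() {
    ls -1 "${1:-processor0}" |
        grep -E '^[0-9]+(\.[0-9]+)?([eE][-+]?[0-9]+)?$' |
        LC_ALL=C sort -g
}

# Scratch directory with the times from the example above.
demo=$(mktemp -d)
mkdir -p "$demo/processor0/constant"
for t in 0.5 0.25 0.125 0.375; do mkdir "$demo/processor0/$t"; done

# Version sort compares the digits after the dot as whole numbers
# (5 < 25 < 125 < 375), so the times are not in order of value.
ls -1v "$demo/processor0"

# General numeric sort: 0.125, 0.25, 0.375, 0.5 (one per line).
list_times "$demo/processor0"
```

Pinning `LC_ALL=C` for `sort -g` is a precaution: in locales that use a comma as the decimal separator, names such as 0.125 may not be parsed as the intended value.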

Regardless, this is a great script, very useful to get a faster reconstruction. Thanks to all those who contributed.
Attached Files
File Type: txt parReconstructPar.txt (5.3 KB, 160 views)

#38   April 13, 2019, 13:16
nskelly (Nat K)
Is it possible to reconstruct specific timestamps using -t option?

#39   April 14, 2019, 03:35
jwstolk (Jaap Stolk)
Quote:
Originally Posted by nskelly View Post
Is it possible to reconstruct specific timestamps using -t option?

This script assigns different reconstruct jobs to different processes. I have not used it for a while, but if you set the time range to include only a single timestamp, the script will only use a single process, and the result should be identical to just running "reconstructPar -time x.xx".
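
For a handful of specific timestamps, the standard tool can also be called directly, since its -time option accepts a comma-separated list. A minimal sketch, where `join_times` is a hypothetical helper that only builds the argument:

```shell
# Join the given time names with commas, the list form accepted by -time.
join_times() {
    ( IFS=,; printf '%s\n' "$*" )
}

join_times 0.5 1 2.5   # prints 0.5,1,2.5

# Intended use inside a decomposed case (requires OpenFOAM):
#   reconstructPar -time "$(join_times 0.5 1 2.5)"
```

This runs as a single process, so it only makes sense when the number of selected times is small.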


I now mostly use the paraFoam option to visualize a decomposed case, without the need for reconstructing.

#40   July 1, 2020, 10:13
weiyao (Wei Yao)
Very useful scripts!