|
December 24, 2012, 11:06 |
|
#21 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Hi Chris,
Many thanks for the feedback, and my apologies for not having tested this properly back then. The fix for you should be to simply run the respective commands in the new step #3: http://openfoamwiki.net/index.php/In...ian#Debian_6.0
The new commands write the necessary custom variables into the "etc/prefs.sh" file, making them the standard settings, so that they are properly read in the remote shells. And don't forget to run them from the base OpenFOAM folder, namely "$HOME/OpenFOAM".
The previous problem is now described here, if you're curious about it: http://www.openfoam.org/mantisbt/view.php?id=231#c1846 - comment #1846
Best regards, Bruno
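P.S. If you want to double-check the result: a prefs.sh written this way should just contain a few export lines. Here's a sketch of what "OpenFOAM-2.1.1/etc/prefs.sh" can end up looking like - the variable values below are illustrative assumptions, not necessarily what step #3 writes for your particular setup:
Code:
# Illustrative etc/prefs.sh contents (values are assumptions;
# use whatever the wiki's step #3 actually writes for you)
export foamCompiler=system
export WM_COMPILER=Gcc
export WM_MPLIB=SYSTEMOPENMPI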
|
|
January 6, 2013, 01:17 |
prefs.sh
|
#22 |
New Member
Chris Fisichella
Join Date: Oct 2012
Posts: 28
Rep Power: 14 |
Hi Bruno,
Thanks for suggesting that fix. I have two machines that use the new instructions, and both have OpenFOAM-2.1.1/etc/prefs.sh. One is a 64-bit machine and the other is a 32-bit machine; they are referred to as 192.168.1.20 and 192.168.1.21. The .20 machine is the master and .21 is the slave. Both have two cores. I can't seem to get this to work:
Code:
foamJob -p -s Test-parallel
In the laminar directory, I created:
laminar/node0
laminar/node1
laminar/node2
laminar/node3
I modified the damBreak example in the following way:
1. I edited decomposeParDict to request distributed computation. I am attaching that file (a sketch of the relevant entries is at the end of this post). I see that in some documentation 'nroots' appears after 'roots'; I noticed you don't have that in yours, but I tried both. I also removed a '3' that was previously after roots.
2. I copied alpha1.org to alpha1.
3. setFields
4. blockMesh
5. decomposePar
6. I created a machines file. I'll attach that, too (also sketched at the end of this post).
7. Each node? directory got a copy of the modified damBreak file set.
8. I temporarily renamed machines to machines.bak, and the following works:
Code:
foamJob -p -s Test-parallel
Here is the output:
Code:
$ foamJob -p -s Test-parallel
Parallel processing using SYSTEMOPENMPI with 4 processors
Executing: /usr/bin/mpirun -np 4 /home/fisichel/OpenFOAM/OpenFOAM-2.1.1/bin/foamExec -prefix /home/fisichel/OpenFOAM Test-parallel -parallel | tee log
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.1.1                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.1.1-221db2718bbb
Exec   : Test-parallel -parallel
Date   : Jan 05 2013
Time   : 23:04:43
Host   : "debian-6-0-3"
PID    : 9323
Case   : /home/fisichel/OpenFOAM/fisichel-2.1.1/run/tutorials/multiphase/interFoam/laminar/node0/damBreak
nProcs : 4
Slaves : 3 ( "debian-6-0-3.9324" "debian-6-0-3.9326" "debian-6-0-3.9330" )
Roots  : 3
(
    "/home/fisichel/OpenFOAM/fisichel-2.1.1/run/tutorials/multiphase/interFoam/laminar/node1"
    "/home/fisichel/OpenFOAM/fisichel-2.1.1/run/tutorials/multiphase/interFoam/laminar/node2"
    "/home/fisichel/OpenFOAM/fisichel-2.1.1/run/tutorials/multiphase/interFoam/laminar/node3"
)
Pstream initialized with:
    floatTransfer     : 0
    nProcsSimpleSum   : 0
    commsType         : nonBlocking
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster
allowSystemOperations : Disallowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time
[1] Starting transfers
[1] slave sending to master 0
[1] slave receiving from master 0
[0] Starting transfers
[0] master receiving from slave 1
[0] (0 1 2)
[0] master receiving from slave 2
[3] Starting transfers
[3] slave sending to master 0
[3] slave receiving from master 0
[2] Starting transfers
[2] slave sending to master 0
[2] slave receiving from master 0
[0] (0 1 2)
[0] master receiving from slave 3
[0] (0 1 2)
[0] master sending to slave 1
[0] master sending to slave 2
[0] master sending to slave 3
End
[2] (0 1 2)
[3] (0 1 2)
[1] (0 1 2)
Finalising parallel run
When I rename machines.bak back to machines, foamJob picks it up, and I get the following error:
Code:
$ foamJob -p -s Test-parallel
Parallel processing using SYSTEMOPENMPI with 4 processors
Executing: /usr/bin/mpirun -np 4 -hostfile machines /home/fisichel/OpenFOAM/OpenFOAM-2.1.1/bin/foamExec -prefix /home/fisichel/OpenFOAM Test-parallel -parallel | tee log
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  2.1.1                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 2.1.1-221db2718bbb
Exec   : Test-parallel -parallel
Date   : Jan 05 2013
Time   : 23:56:28
Host   : "debian-6-0-3"
PID    : 10354
[2]
[2] --> FOAM FATAL IO ERROR:
[2] Expected a ')' or a '}' while reading List, found on line 0 an error
[2]
[2] file: IOstream at line 0.
[2]
[2]     From function Istream::readEndList(const char*)
[2]     in file db/IOstreams/IOstreams/Istream.C at line 159.
[2]
FOAM parallel run exiting
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[3]
[3] --> FOAM FATAL IO ERROR:
[3] Expected a ')' or a '}' while reading List, found on line 0 an error
[3]
[3] file: IOstream at line 0.
[3]
[3]     From function Istream::readEndList(const char*)
[3]     in file db/IOstreams/IOstreams/Istream.C at line 159.
[3]
FOAM parallel run exiting
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 5792 on node 192.168.1.21
exiting without calling "finalize". This may have caused other processes
in the application to be terminated by signals sent by mpirun
(as reported here).
--------------------------------------------------------------------------
[debian-6-0-3:10350] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[debian-6-0-3:10350] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
Some notes:
a. I opened a root terminal on the slave machine via Gnome and typed 'which Test-parallel'; it points to the correct one, so I think the environment is there.
b. The node2 and node3 directories were created with the mount command. They actually reside on the slave machine (.21), but I made them part of the master's file system to more accurately model your setup. The mount options were (rw,sync,no_subtree_check).
c. I did have to make changes to etc/bashrc that went beyond what the directions recommend for end users: I needed to remove where root was being inserted into the path, and I replaced it with my username.
d. I googled the problem, and it appears the solution is to go back to openmpi/mpirun 1.4.3 (mantisbt 296). I think I am misapplying that patch, however.
I don't see why I should be getting that error when I introduce a machines file. Any ideas? Your thoughts would be appreciated.
Thanks, Chris
Attached: 99932-compilation-error-openfoam-2-1-x-decomposeParDict.tar.gz, machines.tar.gz, globalbash.tar.gz
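P.S. For anyone reading along, here are sketches of the two key files. The real ones are in the attachments above, so treat these as illustrative only. The distributed-run entries in system/decomposeParDict (FoamFile header omitted):
Code:
// Sketch of system/decomposeParDict for a 4-way distributed run.
// The roots list needs one entry per slave process (nProcs - 1 entries);
// the paths below match the node1..node3 folders from the log above.
numberOfSubdomains 4;

method          simple;

simpleCoeffs
{
    n               (2 2 1);   // illustrative split, as in the stock damBreak tutorial
    delta           0.001;
}

distributed     yes;

roots
(
    "/home/fisichel/OpenFOAM/fisichel-2.1.1/run/tutorials/multiphase/interFoam/laminar/node1"
    "/home/fisichel/OpenFOAM/fisichel-2.1.1/run/tutorials/multiphase/interFoam/laminar/node2"
    "/home/fisichel/OpenFOAM/fisichel-2.1.1/run/tutorials/multiphase/interFoam/laminar/node3"
);
And the machines file, in Open MPI hostfile format (both machines have two cores):
Code:
# Sketch of an Open MPI hostfile ("machines")
192.168.1.20 slots=2
192.168.1.21 slots=2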
|
January 6, 2013, 15:26 |
|
#23 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Hi Chris,
That's a lot of tests and files... I can't tell from your post whether you tested using the two machines with your own user and without the distributed-folders method. I don't have much time to go into details, so I'll try to summarize the suggestions:
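For example, one thing worth confirming (a sketch of the kind of check I mean; adjust the host and command to your setup): mpirun reaches the slave through non-interactive ssh shells, so the OpenFOAM environment has to be visible there, not only in a normal login terminal:
Code:
# Run through a NON-interactive remote shell, the same way mpirun does;
# both commands should resolve on the slave (192.168.1.21 in your setup)
ssh 192.168.1.21 'which Test-parallel'
ssh 192.168.1.21 'echo $WM_PROJECT_DIR'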
Best regards, Bruno
|
|
January 6, 2013, 15:54 |
|
#24 |
New Member
Chris Fisichella
Join Date: Oct 2012
Posts: 28
Rep Power: 14 |
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
|
|
January 6, 2013, 16:26 |
|
#25 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Hi Chris,
A quick reply:
Bruno
|
|
January 6, 2013, 16:59 |
|
#26 |
New Member
Chris Fisichella
Join Date: Oct 2012
Posts: 28
Rep Power: 14 |
Hi Bruno,
A quick response! Quote:
Quote:
Quote:
Thanks, Chris

Last edited by fisichel; January 6, 2013 at 18:41. Reason: typo |
|
January 13, 2013, 21:42 |
|
#27 |
New Member
Chris Fisichella
Join Date: Oct 2012
Posts: 28
Rep Power: 14 |
Hi Bruno,
I tried your ideas, and your notes were very useful. A couple of the links talked about dealing with non-interactive shells, and that was one of my problems: for a default Debian installation, I had to go into my .bashrc and comment out the line that exits if it detects a non-interactive shell (see the sketch at the end of this post).

Also, I uninstalled the SYSTEMOPENMPI software and brought in OpenMPI 1.6.3. That is their latest release, and they claim application binary interface compatibility with 1.5.3. I went into the OpenFOAM file OpenFOAM-2.1.1/etc/config/settings.sh and changed 1.5.3 to 1.6.3, then rebuilt (i.e. I used ./Allwmake) to make sure I included all of OpenFOAM's requirements, notably GridEngine. I am hoping GridEngine is supported in 1.6.3, as I noticed it is not working in OpenMPI 1.5.3.

I set up a hosts file for the two machines:
Code:
192.168.1.25 debian5
192.168.1.26 debian6
I did not try a heterogeneous calculation, so I did everything on two nearly identical 64-bit machines. Finally, I copied the folder structure you outline in "Running OpenFOAM in parallel with different locations for each process"... and it works. Your Debian wiki was instrumental in getting this to work.

I propose the following changes:
1. Update the .bashrc instructions to properly handle non-interactive logins, unless you have a more clever solution.
2. Build OpenMPI 1.6.3, since the default Debian installation does not have GridEngine support.
3. Set up a hosts file.
If you're comfortable with these updates, I will be happy to make them.
Best Regards, Chris
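P.S. For reference, the line I commented out in the stock Debian ~/.bashrc looks something like the sketch below. The exact wording of the guard varies between releases, and the source line is an illustrative stand-in for however your OpenFOAM environment gets set up:
Code:
# Debian's default ~/.bashrc stops reading the file for non-interactive
# shells (which is what mpirun opens on remote machines), so anything
# sourced below this guard never reaches a parallel run:
#[ -z "$PS1" ] && return    # <- commented out; alternatively, move the
                            #    OpenFOAM setup above this line

# illustrative OpenFOAM environment setup (adjust the path to your install)
source $HOME/OpenFOAM/OpenFOAM-2.1.1/etc/bashrc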
|
January 14, 2013, 04:59 |
|
#28 |
Retired Super Moderator
Bruno Santos
Join Date: Mar 2009
Location: Lisbon, Portugal
Posts: 10,981
Blog Entries: 45
Rep Power: 128 |
Hi Chris,
Good to know that you've managed to get things working!
As for changing the wiki: points 1 and 2 are fine for that Debian page, so feel free to go ahead and make the modifications! Point 3 should be part of a page that is yet to be created, about running OpenFOAM in parallel. If you're willing to kick-start such a page, you can create it under tips-n-tricks: http://openfoamwiki.net/index.php/Main_TipsAndTricks - later on it can be moved to a more global section, once more contributions start coming in!
Best regards, Bruno
|
|