November 7, 2007, 09:42
Ok, I have configured the .ssh
#21
New Member
Christofer Ivarsson
Join Date: Mar 2009
Posts: 21
Rep Power: 17
Ok, I have configured .ssh with an 'authorized_keys' file, and I can now ssh into the server node without a password; though, for some reason, not in the reverse direction. When I ssh into the server, all the OpenFOAM environment variables are set automatically via .bashrc, so that seems to work properly. Still, mpirun never starts after executing:

Code:
$OPENMPI_ARCH_PATH/bin/mpirun --hostfile machines -np 4 simpleFoam $HOME VAFAB_multi -parallel

Nothing happens. The /bin/true exists, but:

Code:
bash: bin/true: can not find file

Still, I believe some environment files are missing...
/C
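For reference, a minimal two-way check of the passwordless login might look like this (a sketch; 'server' and 'workstation' stand in for the actual host names):

Code:
# from the workstation: should print the server's host name with no password prompt
ssh server /bin/hostname
# and back again from the server; mpirun only needs the launching direction,
# but a failure here can still hint at a key or permissions problem
ssh workstation /bin/hostname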
November 7, 2007, 10:38
The /bin/true exists but
#22
Senior Member
Mark Olesen
Join Date: Mar 2009
Location: https://olesenm.github.io/
Posts: 1,714
Rep Power: 40
Quote:
-bash: bin/true: No such file or directory

With '/bin/true', however, it works fine.

Check that it works on the same machine:

Code:
$OPENMPI_ARCH_PATH/bin/mpirun -np 4 /bin/hostname

Add '--debug-daemons' and see what you find. It might be time to find someone closer to your location (e.g. a sysadmin) and have them take a look.
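Put together, the cross-machine check with daemon debugging might look like this (a sketch, reusing the 'machines' hostfile from the earlier post):

Code:
$OPENMPI_ARCH_PATH/bin/mpirun --debug-daemons --hostfile machines -np 4 /bin/hostname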
November 7, 2007, 11:52
Sysadmins who know Linux exist only in heaven
#23
New Member
Christofer Ivarsson
Join Date: Mar 2009
Posts: 21
Rep Power: 17
Sysadmins who know Linux exist only in heaven; Windows has bewitched them all... but thanks for all your support!
/Christofer
March 11, 2008, 12:24
I have an execution parallel problem with OpenMPI
#24
Member
merrouche djemai
Join Date: Mar 2009
Location: ain-oussera, djelfa, algeria
Posts: 46
Rep Power: 17
Hi!
I have a parallel execution problem with Open MPI. I run an interFoam case on two PCs under Fedora 7 and get errors during execution. Both PCs have the case, and I searched the Open MPI forum, but the recommendations don't work for my case. Here are the details:

- Network interfaces on PC1 (/sbin/ifconfig):

Code:
eth0      Link encap:Ethernet  HWaddr 00:1D:92:09:A9:BE
          inet addr:192.168.0.2  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::21d:92ff:fe09:a9be/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:378 errors:0 dropped:0 overruns:0 frame:0
          TX packets:92 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:63038 (61.5 KiB)  TX bytes:18064 (17.6 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:14355 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14355 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:90409124 (86.2 MiB)  TX bytes:90409124 (86.2 MiB)

virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:40 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:8700 (8.4 KiB)

- Network interfaces on PC2 (/sbin/ifconfig):

Code:
eth0      Link encap:Ethernet  HWaddr 00:00:21:0B:C6:2B
          inet addr:192.168.0.3  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::200:21ff:fe0b:c62b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:113 errors:0 dropped:0 overruns:0 frame:0
          TX packets:105 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:17546 (17.1 KiB)  TX bytes:19208 (18.7 KiB)
          Interrupt:5 Base address:0x4000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:19960 errors:0 dropped:0 overruns:0 frame:0
          TX packets:19960 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:142422400 (135.8 MiB)  TX bytes:142422400 (135.8 MiB)

virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:35 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:8390 (8.1 KiB)

- Execution attempt (1), with ssh tried first:

Code:
[mer@merrouche2 ~]$ mpirun --mca pls_rsh_agent "ssh : rsh" --hostfile /home/mer/machinefile -np 2 interFoam /home/mer/OpenFOAM/mer-1.4.1/run/tutorials/interFoam case_2 -parallel

MPI Pstream initialized with:
    floatTransfer     : 1
    nProcsSimpleSum   : 0
    scheduledTransfer : 0

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.4.1                                 |
|   \\  /    A nd           | Web:      http://www.openfoam.org               |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/

Exec   : interFoam /home/mer/OpenFOAM/mer-1.4.1/run/tutorials/interFoam case_2 -parallel
[0] Date   : Mar 11 2008
[0] Time   : 15:17:23
[0] Host   : merrouche2
[0] PID    : 4576
[1] Date   : Mar 11 2008
[0] Root   : /home/mer/OpenFOAM/mer-1.4.1/run/tutorials/interFoam
[0] Case   : case_2
[0] Nprocs : 2
[0] Slaves :
[0] 1
[0] (
[0] merrouche3.3216
[0] )
[0] Create time
[1] Time   : 15:20:19
[1] Host   : merrouche3
[1] PID    : 3216
[1] Root   : /home/mer/OpenFOAM/mer-1.4.1/run/tutorials/interFoam
[1] Case   : case_2
[1] Nprocs : 2
[merrouche2][0,1,0][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111
[merrouche3][0,1,1][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=111

- Execution attempt (2), with rsh tried first:

Code:
[mer@merrouche2 ~]$ mpirun --mca pls_rsh_agent "rsh : ssh" --hostfile /home/mer/machinefile -np 2 interFoam /home/mer/OpenFOAM/mer-1.4.1/run/tutorials/interFoam case_2 -parallel

merrouche3: Connection refused
[merrouche2:05257] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[merrouche2:05257] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1164
[merrouche2:05257] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[merrouche2:05257] ERROR: A daemon on node merrouche3 failed to start as expected.
[merrouche2:05257] ERROR: There may be more information available from
[merrouche2:05257] ERROR: the remote shell (see above).
[merrouche2:05257] ERROR: The daemon exited unexpectedly with status 1.
[merrouche2:05257] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[merrouche2:05257] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1196
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job.
Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------

Please help me.
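One pattern worth noting in the listings above: both PCs expose an identical virbr0 interface (192.168.122.1, created by libvirt), and errno=111 is "connection refused", which can happen when Open MPI's TCP transport picks the wrong interface. A hedged thing to try, restricting MPI traffic to the real NIC (parameter names as in Open MPI 1.2):

Code:
# sketch: use only eth0 for MPI traffic, and force ssh for launching
mpirun --mca btl_tcp_if_include eth0 --mca pls_rsh_agent ssh \
    --hostfile /home/mer/machinefile -np 2 \
    interFoam /home/mer/OpenFOAM/mer-1.4.1/run/tutorials/interFoam case_2 -parallel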
March 11, 2008, 14:06
Do you have any firewalls running?
#25
Senior Member
Michael Jaworski
Join Date: Mar 2009
Location: Champaign, IL, USA
Posts: 126
Rep Power: 17
Merrouche,
Do you have any firewalls running on either machine? My understanding of Open MPI is that it doesn't use a specific range of ports, so it isn't possible to configure a firewall to let it through; or if there is a way, no one seems to know it. I'd suggest testing this by disabling your firewalls and trying mpirun again.
Good luck,
Mike J.
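On Fedora of that era, the quick test would be something like the following (a sketch; run as root on both PCs, and re-enable the firewall afterwards):

Code:
service iptables stop   # stop the firewall for this session
iptables -L -n          # verify that no filtering rules remain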
March 11, 2008, 15:51
Thanks for your quick response
#26
Member
merrouche djemai
Join Date: Mar 2009
Location: ain-oussera, djelfa, algeria
Posts: 46
Rep Power: 17
Hi Mike!
Thanks for your quick response. The firewalls are disabled, and I can ssh to the machines. Also, my case works on each machine alone using mpirun. Some more information:
- If I type ompi_info, the MCA pls_rsh_agent is rsh, and I cannot rsh to the machines; that explains the second case ("merrouche3: Connection refused").
- I can, however, ssh to the machines, so I am asking whether my expression --mca pls_rsh_agent "ssh : rsh" is correct and forces Open MPI to use ssh.
Thanks
March 15, 2008, 08:30
Is there anyone who can help me resolve my problem?
#27
Member
merrouche djemai
Join Date: Mar 2009
Location: ain-oussera, djelfa, algeria
Posts: 46
Rep Power: 17
Hi!
Is there anyone who can help me resolve my problem? Until now I have not been able to run with mpirun under OF 1.4.1; the errors are the same as mentioned in my previous posts. I tried the hints posted in the forum without any success. I even created a new user and installed a fresh OF 1.4.1, but the problem remains. Is there any relation between LAM/MPI and Open MPI? I ask because I installed lam-7.1.3 on my PCs before installing OF 1.4.1. I really need your help. Thanks.
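A quick way to see whether the older LAM installation is shadowing Open MPI would be something like this (a sketch; the exact version output differs between the two MPIs):

Code:
which mpirun mpicc            # which installation comes first in the PATH?
mpirun --version 2>&1 | head -1
ompi_info | head -3           # identifies an Open MPI installation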
April 20, 2009, 16:55
OpenMPI on thin client
#28
New Member
Dominic Spreitz
Join Date: Mar 2009
Location: Lucern, Switzerland
Posts: 10
Rep Power: 17
Hi forum,
I have a problem with Open MPI and OF 1.5. I read lots of posts in this forum and sorted out many problems along the way (thanks for all the helpful advice already posted here), but now I am really stuck.

As background, the big picture of my intentions: over nights and weekends I want to use the CAD workstations in the office to run OF in parallel. Since I can't install anything locally on these workstations, I have to use them as thin clients and boot them over the GBit network (PXE, DHCP, atftp, NFS, ...). The server that supplies the boot image (derived from pelicanHPC) to the clients runs Ubuntu, and I installed OF 1.5 on it with Jure's script.

At this point I can successfully boot one client over the network and then mount the server's home directory, where OF resides, using NFS. When I connect to the client using ssh with passwordless login, I can run icoFoam and the cavity case on the client. I can run the same case on the server too.

To check the Open MPI installation, I ran the 'hello world' example that Mark suggested, in my home directory (which is also mounted on the client), on the server:

Code:
wget http://icl.cs.utk.edu/open-mpi/paper...p-2006/hello.c
mpicc hello.c -o hello
mpirun -np 4 ./hello

which resulted in:

Code:
Hello, World. I am 0 of 4
Hello, World. I am 1 of 4
Hello, World. I am 2 of 4
Hello, World. I am 3 of 4

To check that I can run the hello world example on server and client together, I tried:

Code:
mpirun -np 4 -host 10.11.12.1 -host 10.11.12.2 ./hello

and got the same result as above. I verified that the example actually ran on both machines by adding the -d option to the command. So far so good: OF seems to work, networking between the machines works, MPI runs. However, if I try:

Code:
mpirun -np 2 -hostfile machines icoFoam /home/user/OpenFOAM/user-1.5/run/tutorials/icoFoam/cavity -parallel > log_mpi_icoFOAM 2>&1

I get:

Code:
cat log_mpi_icoFOAM
--------------------------------------------------------------------------
Failed to find the following executable:

Host:       debian
Executable: icoFoam

Cannot continue.
--------------------------------------------------------------------------
mpirun noticed that job rank 0 with PID 8624 on node 10.11.12.1 exited on signal 15 (Terminated).

From the debug output of the above command (see attachment) I conclude that all my environment variables are set correctly and also exported over MPI. Yet parallel execution on the client fails, although I can run the same case on the client directly, which is very strange to me. Did anybody experience a similar problem, or can anyone at least give me a hint where to start digging for a solution? Any help would be much appreciated.

P.S.: I don't know if it has anything to do with my problem, but I attached the output of ldd -v icoFoam that I ran on the client and the server. The libraries are different, but as I said before, icoFoam runs on the client if I launch it directly through ssh.
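Before changing anything, it might help to see what a non-interactive shell on the client actually finds (a sketch, using the client address from the post above):

Code:
# does the remote shell that mpirun uses know about icoFoam at all?
ssh 10.11.12.2 'which icoFoam; echo $PATH'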
April 20, 2009, 17:45
#29
New Member
Dominic Spreitz
Join Date: Mar 2009
Location: Lucern, Switzerland
Posts: 10
Rep Power: 17
Ok, I have to reply to myself:
My .bashrc file contained:

Code:
[ -z "$PS1" ] && return

which means it did not load any OF-related environment variables at a non-interactive (Open MPI / ssh) login. I changed that behaviour by replacing the line with:

Code:
if [ -z "$PS1" ]; then
    source /home/user/OpenFOAM/OpenFOAM-1.5/etc/bashrc
fi

Now the necessary OF environment variables are loaded at non-interactive login, and my MPI runs execute correctly. Maybe this helps somebody else.
Dominic
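An alternative sketch with the same effect is to source the OpenFOAM environment before the interactivity guard, so interactive and non-interactive shells both pick it up:

Code:
# in ~/.bashrc, ahead of the guard line:
source /home/user/OpenFOAM/OpenFOAM-1.5/etc/bashrc
[ -z "$PS1" ] && return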
May 14, 2009, 03:39
#30
New Member
Hello,
I am trying to run OF 1.5-dev in parallel without any success. I changed my .bashrc as DSpreitz describes, I created the OF setup on all PCs, I installed NFS, and so on. When I try to run icoFoam, for example, the following message appears in my terminal:

Code:
michel@Linux-K:~/OpenFOAM/michel-1.5-dev/run/tutorials/icoFoam/cavity$
/home/michel/OpenFOAM/OpenFOAM-1.5-dev/applications/bin/linuxGccDPOpt/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory

This error comes every time I use the command:

Code:
mpirun --hostfile <machines> -np 2 /home/michel/OpenFOAM/OpenFOAM-1.5-dev/applications/bin/linuxGccDPOpt/icoFoam $FOAM_RUN/tutorials/icoFoam cavity -parallel > log &

But when I use this command:

Code:
mpirun --hostfile <machines> -np 2 icoFoam $FOAM_RUN/tutorials/icoFoam cavity -parallel > log &

the following error appears:

Code:
mpirun --hostfile /home/michel/machines -np 2 icoFoam /home/michel/OpenFOAM/michel-1.5-dev/run/tutorials/icoFoam cavity -parallel > log
--------------------------------------------------------------------------
Failed to find the following executable:

Host:       ubuntu
Executable: interFoam

Cannot continue.
--------------------------------------------------------------------------
mpirun noticed that job rank 0 with PID 14312 on node 192.168.1.82 exited on signal 15 (Terminated)

That is really strange, because OF is, as I said, on the other PC too. I also tried the command with the full path of mpirun (/home/michel/OpenFOAM/ThirdParty/openmpi-1.2.6/platforms/linuxGccDPOpt/bin/mpirun --hostfile ... and so on); then no error appears, but my PC does nothing. Has anybody some suggestions for me? I need the parallel mode because computing my cases takes more and more time.
Thanks, best regards
Michel
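One hedged thing to try for the missing libfiniteVolume.so: ask mpirun to forward the linker search path to the remote ranks (-x exports an environment variable to the launched processes in Open MPI):

Code:
# sketch: forward PATH and LD_LIBRARY_PATH to the remote side
mpirun -x PATH -x LD_LIBRARY_PATH --hostfile /home/michel/machines -np 2 \
    icoFoam $FOAM_RUN/tutorials/icoFoam cavity -parallel > log &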
May 14, 2009, 04:57
well
#31
New Member
I tried the following command with foamExec:

Code:
/home/michel/OpenFOAM/OpenFOAM-1.5-dev/bin/foamExec -v 1.5-dev /home/michel/OpenFOAM/ThirdParty/openmpi-1.2.6/platforms/linuxGccDPOpt/bin/mpirun --hostfile /home/michel/machines -np 2 icoFoam /home/michel/OpenFOAM/michel-1.5-dev/run/tutorials/icoFoam cavity -parallel > log

and this error appears:

Code:
[: 106: ==: unexpected operator
[: 106: ==: unexpected operator
[: 153: ==: unexpected operator
[: 257: ==: unexpected operator
[: 259: ==: unexpected operator
[: 56: ==: unexpected operator
[: 74: ==: unexpected operator

and nothing happens. Which file causes this error, and what can I do to solve it?
Thank you again
Michel
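Those 'unexpected operator' messages are characteristic of a bash-only test ('[ ... == ... ]') being interpreted by Ubuntu's POSIX /bin/sh (dash). A two-line demonstration of the difference:

Code:
sh -c '[ a == a ]'     # dash complains: unexpected operator
bash -c '[ a == a ]'   # bash accepts the == comparison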
May 14, 2009, 14:46
#32
New Member
Dominic Spreitz
Join Date: Mar 2009
Location: Lucern, Switzerland
Posts: 10
Rep Power: 17
Michel,
I don't know if it really changes much, but the mpirun command that I use contains a -case statement. Here is the command (everything on one line, obviously):

Code:
mpirun --hostfile ~/machines_pil -np 12 /home/user/OpenFOAM/OpenFOAM-1.5/applications/bin/linuxGccDPOpt/icoFoam -case /home/user/OpenFOAM/user-1.5/run/tutorials/icoFoam/cavity -parallel

Dominic
May 15, 2009, 04:02
#33
New Member
Thanks Dominic for your reply,
I tried it with -case, but the same error appears:

Code:
/home/michel/OpenFOAM/OpenFOAM-1.5-dev/applications/bin/linuxGccDPOpt/icoFoam: error while loading shared libraries: libfiniteVolume.so: cannot open shared object file: No such file or directory

I think Open MPI does not know the paths to all the libraries. I tried to export them in .bashrc, but it didn't work. Maybe I am doing something wrong when adding the paths; I don't know.
Thanks
Michel
May 15, 2009, 10:29
#34
New Member
I checked my $PATH and $LD_LIBRARY_PATH on every machine; both look like this:

Code:
$PATH
/home/michel/OpenFOAM/ThirdParty/ParaView3.3-cvs/platforms/linuxGcc/bin:/home/michel/OpenFOAM/ThirdParty/cmake-2.4.6/platforms/linux/bin:/home/michel/OpenFOAM/ThirdParty/openmpi-1.2.6/platforms/linuxGccDPOpt/bin:/home/michel/OpenFOAM/ThirdParty/gcc-4.3.1/platforms/linux/bin:/home/michel/OpenFOAM/michel-1.5-dev/applications/bin/linuxGccDPOpt:/home/michel/OpenFOAM/OpenFOAM-1.5-dev/applications/bin/linuxGccDPOpt:/home/michel/OpenFOAM/OpenFOAM-1.5-dev/wmake:/home/michel/OpenFOAM/OpenFOAM-1.5-dev/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games

$LD_LIBRARY_PATH
/home/michel/OpenFOAM/ThirdParty/ParaView3.3-cvs/platforms/linuxGcc/bin:/home/michel/OpenFOAM/OpenFOAM-1.5-dev/lib/linuxGccDPOpt/openmpi-1.2.6:/home/michel/OpenFOAM/ThirdParty/openmpi-1.2.6/platforms/linuxGccDPOpt/lib:/home/michel/OpenFOAM/ThirdParty/gcc-4.3.1/platforms/linux/lib:/home/michel/OpenFOAM/michel-1.5-dev/lib/linuxGccDPOpt:/home/michel/OpenFOAM/OpenFOAM-1.5-dev/lib/linuxGccDPOpt

I think everything is OK, but mpirun is still not working. Are PATH and LD_LIBRARY_PATH OK, or do I have to add something in .bashrc?
regards
Michel
May 20, 2009, 07:01
#35
New Member
I am still not able to run in parallel.
Does anybody have any ideas for me?
Michel
May 20, 2009, 09:36
#36
Senior Member
Mark Olesen
Join Date: Mar 2009
Location: https://olesenm.github.io/
Posts: 1,714
Rep Power: 40
With foamExec being called by mpirun, instead of vice versa:

Code:
# Can also be used for parallel runs, e.g.
#   mpirun -np <nProcs> \
#       foamExec -v <foamVersion> <foamCommand> ... -parallel
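Applied to the cavity case from the earlier posts, that ordering would look something like this (a sketch; paths taken from holzmichel's setup, and foamExec must be reachable on the remote machine):

Code:
mpirun --hostfile /home/michel/machines -np 2 \
    foamExec -v 1.5-dev icoFoam \
    -case /home/michel/OpenFOAM/michel-1.5-dev/run/tutorials/icoFoam/cavity -parallel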
May 20, 2009, 09:42
#37
Senior Member
Mark Olesen
Join Date: Mar 2009
Location: https://olesenm.github.io/
Posts: 1,714
Rep Power: 40
Check what a non-interactive shell actually sees on each machine:

Code:
ssh $HOST 'echo $PATH; echo; echo $LD_LIBRARY_PATH'
May 26, 2009, 04:57
#38
New Member
Hello Mark,
your way solved my problem. Thank you for your help.
Best regards
August 1, 2009, 16:49
#39
Senior Member
Tomislav Maric
Join Date: Mar 2009
Location: Darmstadt, Germany
Posts: 284
Blog Entries: 5
Rep Power: 21
Hello everyone, I'm glad I've found this thread, since I have the same problem holzmichel ran into. Searching the net, I found instructions saying I should comment out the if command that makes bash return when it runs in non-interactive mode.

Actually, I'm running OpenFOAM from a SLAX live DVD, and I'm trying to figure out how to use this live DVD for simulations on a LAN. The following code was suggested by olesen:

Code:
ssh $HOST 'echo $PATH; echo; echo $LD_LIBRARY_PATH'

Thank you in advance,
Tomislav
August 2, 2009, 17:06
#40
Senior Member
Tomislav Maric
Join Date: Mar 2009
Location: Darmstadt, Germany
Posts: 284
Blog Entries: 5
Rep Power: 21
My command goes like this:

Code:
/path/name/of/mpirun -np 2 -H mario foamExec -v 1.5-dev interFoam -parallel

and it works fine on my dual-core laptop. What am I doing wrong?
Similar Threads
Thread | Thread Starter | Forum | Replies | Last Post
Problems with Fedora 9 and OpenMPI | harly | OpenFOAM Running, Solving & CFD | 11 | May 3, 2009 05:18
[snappyHexMesh] SnappyHexMesh in parallel openmpi | wikstrom | OpenFOAM Meshing & Mesh Conversion | 7 | November 24, 2008 10:52
Problems using local openmpi | stephan | OpenFOAM Installation | 1 | December 5, 2007 19:01
OpenMPI performance | vega | OpenFOAM Running, Solving & CFD | 13 | November 27, 2007 02:28
OpenFOAM 1.4 with OpenMPI 1.2 | fhy | OpenFOAM Installation | 0 | July 12, 2007 19:12