Problems with 2 node cluster with Mandrake 10.0 |
|
January 24, 2006, 05:28 |
Problems with 2 node cluster with Mandrake 10.0
|
#1 |
Guest
Posts: n/a
|
Hi there all, we are trying to get a wee 2 node cluster up and running using Mandrake 10.0 as the OS.
So far we are stuck with this error:

NP: Spawning STAR processes on multiple nodes (cluster).
bash: line 1: /home/starhome/test/entalpia/entalpia_0001/.starboot.run: No such file or directory
p0_5413: p4_error: Timeout in making connection to remote process on administracion2: 0
/usr/starcd/MPICH/1.2.4/linux_2.4-gcc_2.95.3-glibc_2.2.2-dso/ch_p4/bin/mpirun: line 1: 5413 Broken pipe /home/starhome/test/entalpia/entalpia_0001/.starboot.run -p4pg .starboot.mpi -p4wd /home/starhome/test/entalpia/entalpia_0001
PNP: Shutdown [2006-01-23-17:35:42] Execution aborted by request (SIGABRT) after 309 seconds (TOTAL ELAPSED TIME).

What is that .starboot.run file that Star-CD is looking for? Have any of you had the same problem? How did you solve it? Any help and hints are welcome!! Thanks a lot, Esti.
|
January 24, 2006, 09:01 |
Re: Problems with 2 node cluster with Mandrake 10.0
|
#2 |
Guest
Posts: n/a
|
Make sure that you have write permission in every directory you are using. Make sure that both nodes can see the master directory and that it has the same path on both nodes. Sometimes an NFS-mounted partition has one name on its local machine and another on a remote one.
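The two checks suggested above (write permission, and the same path being visible on both nodes) can be sketched in Python. This is only a minimal sketch and not part of Star-CD: the function names are hypothetical, and the remote check assumes that `rsh` between the nodes works without a password prompt and that the path contains no single quotes.

```python
import os
import subprocess
import tempfile


def is_writable_dir(path):
    """Return True if `path` is an existing directory we can create a file in."""
    if not os.path.isdir(path):
        return False
    try:
        # Creating (and automatically deleting) a temp file is a real write test.
        with tempfile.TemporaryFile(dir=path):
            return True
    except OSError:
        return False


def remote_dir_command(host, path):
    """Build an rsh command that tests the same path on another node.

    The remote shell prints OK or MISSING. Classic rsh does not pass back
    the remote exit status, so the caller should look at the printed text.
    """
    check = f"test -d '{path}' && test -w '{path}' && echo OK || echo MISSING"
    return ["rsh", host, check]


def remote_dir_ok(host, path, timeout=30):
    """Run the remote check; True only if the remote side printed OK."""
    result = subprocess.run(
        remote_dir_command(host, path),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip() == "OK"
```

For the case in this thread, one would call `is_writable_dir("/home/starhome/test/entalpia/entalpia_0001")` on the master and `remote_dir_ok("administracion2", "/home/starhome/test/entalpia/entalpia_0001")` for the other node (host name and path taken from the error message). If the local check passes and the remote one does not, the case directory is probably not shared under the same path.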
|
|
January 24, 2006, 11:02 |
Re: Problems with 2 node cluster with Mandrake 10.0
|
#3 |
Guest
Posts: n/a
|
Hi there Steve, thanks very much for your quick reply.
Our cluster is composed of 2 CPUs, one of them called Master and the other one Slave, with the same names on local and remote. Both machines can communicate perfectly: the rsh-server is installed and works fine, and I can send any command from one machine to the other. We also have a precompiled MPICH, provided by CD-adapco. The machines work fine in sequential runs, but when it comes to parallel computations, the same error appears again and again. Could it be because of the computer architecture or type or something like that? The same problem appears with these two nodes under Windows... Any other suggestions? Thanks a lot...
|
January 28, 2006, 10:09 |
Re: Problems with 2 node cluster with Mandrake 10.0
|
#4 |
Guest
Posts: n/a
|
Esti,
I would echo Steve's comments. It appears to me that the directory /home/starhome/test/entalpia/ on the slave is not NFS-mounted as the same directory that the master sees. Have you tried the -distribute flag? Using this flag with Star-CD v3.24 or later (where the NFS-mount requirement is not as stringent) may help.
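One way to see whether the case directory on a node really sits on an NFS mount is to look it up in the kernel's mount table. The sketch below assumes a Linux system with `/proc/mounts`; the function names are hypothetical, and mount points containing spaces (which `/proc/mounts` escapes) are not handled.

```python
import os


def mount_for(path, mounts_text):
    """Return (mount_point, fs_type) of the mount that contains `path`.

    `mounts_text` is text in /proc/mounts format:
    device mount_point fs_type options dump pass
    Returns None if no line matches.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # Prefer the longest (most specific) mount point; on a tie,
            # the later entry wins because it shadows the earlier one.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def fs_type_of(path):
    """Look up the filesystem type of `path` using the live /proc/mounts."""
    with open("/proc/mounts") as handle:
        found = mount_for(path, handle.read())
    return found[1] if found else None
```

Running `fs_type_of("/home/starhome/test/entalpia")` on each node and comparing the results shows whether the slave sees the directory through NFS or only has a separate local directory with the same name.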
|
February 1, 2006, 12:48 |
Re: Problems with 2 node cluster with Mandrake 10.0
|
#5 |
Guest
Posts: n/a
|
Hi Mike, thanks very much for your hint. This is the first time I have had to install Linux and all that stuff, so I think I should start from the basics and learn a bit more about it... I've no idea what NFS is, so imagine....
Thanks a lot to you all, I'll let you know how it goes!!! Cheers, Esti
|