Basics: Set-up for different desktop machines

SD@TUB · January 4, 2010, 15:18

Hello Folks,

i'm setting up an OpenFOAM installation on 3 desktop-pcs, which are connected
via lan. After searching the forum for consistent information about correct
set-up for parallel running, i'm a little confused due to my problem!

The system have to work for 2 seperate users, say Usr1 and Usr2. So my idea
was:

1st PC: head-node as server
2nd PC: node for Usr1 as client
3nd PC: node for Usr2 as client

All machines are Core2Duo with 12GB RAM at all. I installed GNU/Linux Debian 64-bit
and OF-1.6 64-bit on all nodes.
The server ist permanent running but the clients just when each user works on it.
(OF works fine on every single node.)

Now i'm facing the problem, that the clients have different directories of
course:

/home/Usr1
/home/Usr2

Now when i try to run a case in parallel, say on client Usr1 using the server via mpirun,
OF is asking for login data Usr1@server. After entering the data i get an error message:

Code:

bash: orted: command not found ...

OpenMPI has some problems with the settings of $LD_LIBRARY_PATH!
(I've tried to fix the problem with 2 accounts on the server for Usr1 and Usr2 including
the OF-Installation on both accounts, but it didn't change anything)

I've read in some threads, that on all nodes there have to be identical directories to
make sure OF works properly in combination with NFS/NIS. I need some details on that!

Has anybody helpful suggestions to manage my requested set-up?
(I can't imagine that there is only the way of using the same user account on every node)

thx,
Stefan

olesen · January 5, 2010, 06:24

It's always a bit difficult with remote diagnoses but I'll give it a go.

Quote:

Originally Posted by SD@TUB

The system have to work for 2 seperate users, say Usr1 and Usr2. So my idea
was:

1st PC: head-node as server
2nd PC: node for Usr1 as client
3nd PC: node for Usr2 as client

...

Now i'm facing the problem, that the clients have different directories of
course:

/home/Usr1
/home/Usr2

This isn't really a problem in itself, but you will need to have the calculation data available between the nodes. The simplest is to use NFS for that.

Quote:

Originally Posted by SD@TUB

OF is asking for login data Usr1@server. After entering the data i get an error message:

Code:

bash: orted: command not found ...

OpenMPI has some problems with the settings of $LD_LIBRARY_PATH!
(I've tried to fix the problem with 2 accounts on the server for Usr1 and Usr2 including
the OF-Installation on both accounts, but it didn't change anything)

Regardless of having local user accounts or NIS accounts, you need to be able to login between all the machines (including the head node) without entering extra information. One (IMO good) solution is password-less ssh.
Generate your keys on the user machine (ssh-keygen) and distribute the public part of the keys to the 'authorized_keys'.

Once you have this working, you can address the next part - namely getting your settings to work when logging in remotely and with a non-interactive remote shell. The simplest is to add the OpenFOAM (and thus the openmpi) bits into the respective ~/.bashrc files. Note that ~/.profile would be the wrong place, since this won't get seen by an non-interactive shell.
Another possibility is to forward the information about which files to source. This is what they do in the mpirunDebug and in the wmakeScheduler, but I would avoid that approach for now since it entails even more scripting.

To check that everything is getting found, you can simply test that the path, ld_library_path, etc are getting set properly when accessing remotely.
Eg,

Code:

ssh $HOST 'echo $PATH'   # for this machine
ssh otherHost 'echo $PATH'   # for another machine

NB you must have single-quotes to avoid shell expansion.

After this gets working okay, you can refocus on getting OpenFOAM working in parallel.
For this, you either need NFS mounts between the machines (easy) or need to transport the decomposed geometry to the remote machines and set the decompose roots (in decomposePar) accordingly (hard/annoying).

For your setup, the simplest would be to create an NFS export directory on each machine. For example,
"/data/export" where the user would place their calculation data. And then use an auto-mounter (eg, autofs) to access these data remotely as /net/MACH/data/export.

This should work without any major issues.

/mark

SD@TUB · January 6, 2010, 12:56

Thanks a lot /mark for the helpful remarks! Some questions appear subsequently...
(password-less ssh is done by the way)

First:
Is it needful to have OF installed in each user account Usr1 and Usr2 on the server? (The accounts are still needed to access from the seperate clients, right!?) I'm asking for processing afterwards any updates, configurations, adding users etc. accordingly to OF. Therefore it would be effective of having a centralized installation directory on the server.

Second:
What do you mean with "OpenFOAM (and thus the openmpi) bits" to source "the respective ~/.bashrc files"? (Do I have to add the path to the OF-Installation on the server e.g. in the ~./bashrc files on the clients? An example would be helpful!)

Third:
I'm not sure of understanding the NFS-mounts correctly?

Here my Set-up so far:
Server:
/home/Usr1/OpenFOAM/Usr1-1.6/run...
/home/Usr2/OpenFOAM/Usr2-1.6/run...
Client Usr1:
/home/Usr1/OpenFOAM/Usr1-1.6/run...
Client Usr2:
/home/Usr2/OpenFOAM/Usr2-1.6/run...

Do I have to export the Server-directories and automount them on each client or vice versa? (Why is that needed in cases where no distributed data are set in the 'decomposParDict' on the client where the calculation will running via mpirun)

Thx,
/Stefan

olesen · January 7, 2010, 06:59

Quote:

Originally Posted by SD@TUB

Is it needful to have OF installed in each user account Usr1 and Usr2 on the server? (The accounts are still needed to access from the seperate clients, right!?) I'm asking for processing afterwards any updates, configurations, adding users etc. accordingly to OF. Therefore it would be effective of having a centralized installation directory on the server.

I also prefer using a central installation. Like the example in the README:
FOAM_INST_DIR=/data/app/OpenFOAM
and then you have the following structure
/data/app/OpenFOAM/OpenFOAM-1.6.x/
/data/app/OpenFOAM/ThirdParty-1.6.x/

The 'site' directory provides an ideal place for sharing custom settings, solvers, utilities and libraries.
For versioned files:

/data/app/OpenFOAM/site/1.6.x/bin
/data/app/OpenFOAM/site/1.6.x/lib
/data/app/OpenFOAM/site/1.6.x/constant
/data/app/OpenFOAM/site/1.6.x/system

And for unversioned files:
/data/app/OpenFOAM/site/constant
/data/app/OpenFOAM/site/system

Quote:

Originally Posted by SD@TUB

What do you mean with "OpenFOAM (and thus the openmpi) bits" to source "the respective ~/.bashrc files"? (Do I have to add the path to the OF-Installation on the server e.g. in the ~./bashrc files on the clients? An example would be helpful!)

You have several options here. The simplest is to follow the suggestion from the README and place code like this in your ~/.bashrc:

export FOAM_INST_DIR=/data/app/OpenFOAM
foamDotFile=$FOAM_INST_DIR/OpenFOAM-1.6.x/etc/bashrc
[ -f $foamDotFile ] && . $foamDotFile

Of course, if you have a supported queuing system (we use GridEngine), the mpirun from openmpi takes care of doing the right thing and will be able to find itself on the various clients without needing to touch the ~/.bashrc.

Quote:

Originally Posted by SD@TUB

I'm not sure of understanding the NFS-mounts correctly?

Here my Set-up so far:
Server:
/home/Usr1/OpenFOAM/Usr1-1.6/run...
/home/Usr2/OpenFOAM/Usr2-1.6/run...
Client Usr1:
/home/Usr1/OpenFOAM/Usr1-1.6/run...
Client Usr2:
/home/Usr2/OpenFOAM/Usr2-1.6/run...

Do I have to export the Server-directories and automount them on each client or vice versa? (Why is that needed in cases where no distributed data are set in the 'decomposParDict' on the client where the calculation will running via mpirun)

There are two independent aspects that you need address.

Ensure that a consistent OpenFOAM installation can be found on each node.
Ensure that the user's calculation data can be found on each node.

You can have a consistent OpenFOAM installation either be using a central installation or just by using consistent local paths and a consistent OpenFOAM version on each node. How you solve this is really immaterial.

Ensuring that the user's calculation data can be found on each node entails NFS. How you resolve it depends on how you wish to work.

I generally don't bother with the ~/OpenFOAM/User-Version/run structure that OpenFOAM provides by default. But you get it to work provided that the ~/OpenFOAM/User-Version/run directories are non-local (ie, NFS-shared). One solution could be this:

ln -s /data/shared/Usr1 /home/Usr1/OpenFOAM/Usr1-1.6/run

where the /data/shared/Usr1 is NFS exported/imported from somewhere (ie, head node, workstation, NAS, whatever).

Our workflow means that we instead prefer to use a structure that looks more like this:
/data/work/User/Customer/ProjectNr/catia/
/data/work/User/Customer/ProjectNr/hypermesh/
/data/work/User/Customer/ProjectNr/Component-Revision/OpenFOAM/
/data/work/User/Customer/ProjectNr/Component-Revision/starcd/

To track which application and OpenFOAM version was uses and where, we have a sentinel file 'FOAM_SETTINGS' in each OpenFOAM/ directory with this sort of contents:
APPLICATION=rhoPorousSimpleFoam
FOAM_VERSION=OpenFOAM-1.6.x

This helps prevent accidental restarts with mismatched versions.

I hope this helps you further.
BTW: does @TUB mean TU Berlin or somewhere else?

SD@TUB · January 8, 2010, 04:09

Ok, things getting more clearly...
I'll try to get OF running on all nodes and let the forum know!

BTW: TU Berlin is truly correct! Was it a guess or do you have
connections to TUB?

olesen · January 8, 2010, 06:14

Quote:

Originally Posted by SD@TUB

BTW: TU Berlin is truly correct! Was it a guess or do you have
connections to TUB?

Just a lucky guess (TU == Technische Universitaet), but 'B' might have been Braunschweig as well.

SD@TUB · January 11, 2010, 09:59

Hello again,

after doing all the things above, the mpirun command exited with the following message (calculation started on client Usr1 in connection with Server):

Code:

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.6                                   |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 1.6-f802ff2d6c5a
Exec   : simpleFoam -parallel
Date   : Jan 11 2010
Time   : 13:34:49
Host   : head-node
PID    : 3753
[0] 
[0] 
[0] Cannot read "/home/Usr1/system/decomposeParDict"
[0] 
FOAM parallel run exiting
[0] 
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD 
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 3638 on
node head-node exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

I guess the head-node (Server) can't find the decomposeParDict!? But why is he searching in
/home/Usr1/system/decomposeParDict ?
The default structure normally has to be
/home/Usr1/OpenFOAM/Usr1-1.6/run...case123/system/decomposeParDict
duo to my set-up (see above)?

Another aspect i'm facing is that i do have passwordless-ssh from the clients to the server. Thus i'm able to mount exported directories from the server on clients. The other way round doesn't work. But for a ssh connection from the server to the client, i need an account on the clients for the server!? But this isn't prefered!
Can I prevent such dependences if i'm working with the suggested /data/app/OpenFOAM structure on every node and avoid the server/client configuartion? What about NIS-accounts? I've never worked with NIS yet...

/Stefan

SD@TUB · January 20, 2010, 17:49

So, after another two days working on the three nodes, everthing works regular. I did the recommended changes by Mark. Now on all nodes i installed OF in /data/app (as root). So no 'real' server/client configuration is present, but that doesn't matter for my usage so far.
In addition to that every node has NFS export/import directories. That is needed to ensure the involved nodes find the decompose case data (i.e. processor0, processor1, ..., processorN) on the machine the calculation is started.
To keep my things working properly some confusing annotations! When i set the option distributed no in the decomposePar dict ('cause i do not want to allocate my data) the remote nodes expect their part of data on a local drive (in an equivalent directory structure by default). To evade this i use the distribute option and set the roots to the respective NFS export/import directories. Be careful with the right node order in the <machinefile>, too. The host wherefrom starting the mpirun has to be the first in the list. According to the specified CPUs one become a master, the following treated as slaves. When you run the decompose case via PyFoam the needed <machinefile> for e.g. pyFoamPlotRunner.py use a different order of the slave CPUs (I don't figure out why PyFoam not use the <machinefile> standard openMPI suppose to do).
Finally i have to remark that for further installation purposes it would be particularly advantageous of using a server with NIS accounts. But I'm not sure if OF has to be installed on every node in the cluster when using openMPI. Maybe someone can clarify this subject!?

/Stefan

PS: Thanks Mark for the opportunity calling you! I'll come back to you when necessary.

gschaider · January 21, 2010, 08:06

Quote:

Originally Posted by SD@TUB

Be careful with the right node order in the <machinefile>, too. The host wherefrom starting the mpirun has to be the first in the list. According to the specified CPUs one become a master, the following treated as slaves. When you run the decompose case via PyFoam the needed <machinefile> for e.g. pyFoamPlotRunner.py use a different order of the slave CPUs (I don't figure out why PyFoam not use the <machinefile> standard openMPI suppose to do).

I do not quite understand this: if a machinefile is specified pyFoam uses that. At least for me

To find out how mpirun is actually called set in ~/.pyFoam/pyfoamrc

Code:

[Debug]
ParallelExecution: True

You should then get some additional output on stdout and into ~/.pyFoam/log/general (amongst them the actual call to mpirun). You can add additional parameters as described in http://openfoamwiki.net/index.php/Co...call_to_mpirun

Quote:

Originally Posted by SD@TUB

Finally i have to remark that for further installation purposes it would be particularly advantageous of using a server with NIS accounts.

Why go for NIS if you can have LDAP?

Bernhard

olesen · January 21, 2010, 09:05

Quote:

Originally Posted by SD@TUB

In addition to that every node has NFS export/import directories. That is needed to ensure the involved nodes find the decompose case data (i.e. processor0, processor1, ..., processorN) on the machine the calculation is started.
To keep my things working properly some confusing annotations! When i set the option distributed no in the decomposePar dict ('cause i do not want to allocate my data) the remote nodes expect their part of data on a local drive (in an equivalent directory structure by default). To evade this i use the distribute option and set the roots to the respective NFS export/import directories.

Since the directories are all accessible via NFS, you shouldn't need to be doing so much work. One possibility that really must work without very little effort:

1. Export the directories from each machine. The resulting /etc/exports would resemble this

Code:

/export/data *.mydomain.name(fsid=0,rw,root_squash,sync,no_subtree_check)

This is presumably what you already have.

2. Laziness is virtue. Be lazy and do not import these directories on each machine. This will drive you crazy, is a maintenance nightmare, scales very poorly and can bind more network resources than you wish.

3. Clean out all those NFS mount from the various /etc/fstab files. Only include mounts that absolutely must exist for the system to run (eg, if you are NFS mounting the user home directories). If you can't umount some of the NFS mounts, try "lsof" to see which processes might still have access to them (or be brutal and just reboot the machines).

4. Use the automounter. The /etc/auto.net does a pretty good job of it. Activate it in /etc/auto.master, insert the service and start the autofs service as needed.

The NFS-exported directories (step 1 above) should now be addressable via the automounter.
For example, the "/net/machine1/export/data/" mount should correspond to the "/export/data" NFS share from machine1. Note that a simple "ls /net" will only show directories that are already mounted. You'll need to use "ls /net/machine1" the first time.

If you set things up like this, you will now be able to address the same disk space in two different ways. On all of the machines, you can address this disk space as "/net/machine1/export/data". On machine1, you have the choice of using the real location ("/export/data") or the automounted NFS-share ("/net/machine1/export/data"). Fortunately the kernel is smart enough to notice that the NFS share from machine1 -> machine1 is actually local. You have no performance penalty on the local machine if you address it as "/net/machine1/export/data" instead of "/export/data".

If that is clear, then you can see that the only slight difficulty is to make sure that you are indeed on the "/net/machine1/export/data" path instead of the "/export/data" path when starting your jobs. This is not terribly difficult to overcome with something trivial like this.

Code:

    [ "${PWD#/export/data/}" != "$PWD" ] && cd "/net/$HOST$PWD"

Quote:

Originally Posted by SD@TUB

Be careful with the right node order in the <machinefile>, too.

If you have another runtime (we use GridEngine for example), you don't need to bother with the hostfiles or even the -np option for mpirun. This is one of the really nice aspects of openmpi.

Quote:

Originally Posted by SD@TUB

Finally i have to remark that for further installation purposes it would be particularly advantageous of using a server with NIS accounts. But I'm not sure if OF has to be installed on every node in the cluster when using

Using NIS only solves the username and passwd issue. All the other pieces (get the NFS shares and automounter working nicely) remains the same. I never had any luck get the NIS automount to do anything useful though.

Likewise, OpenFOAM does not need to be installed on every node, but it must be accessible from every node. A NFS share is the most convenient, but you could also compile and install it on one machine and use rsync to distribute it about if that works better for you.

SD@TUB · January 21, 2010, 16:22

@Bernhard
I just noticed using the <machinefile> in the way openMPI wants it, e.g.

Code:

head-node cpu=X
client-node1 cpu=Y
...

the PyFoamPlotRunner.py exceed with an error like "to much entries in one line"! So I omit the "cpu=" entries and the ...Runner starts without any problems. In addition to that i realized that the order of the involved CPUs differs to the event started via mpirun itself.
Therfore i get a descending order:

Code:

master:
CPU_X.1 (head-node)
slaves:
CPU_X.2 (head-node)
...
CPU_X.N (head-node)
CPU_Y.1 (client1-node)
CPU_Y.2 (client1-node)
...
CPU_Y.N (client1-node)

Using the mod <machineFile> according PyFoam (see above) i get an alternate order:

Code:

master:
CPU_X.1 (head-node)
slaves:
CPU_Y.1 (client1-node)
CPU_X.2 (head-node)
CPU_Y.2 (client1-node)
...
CPU_X.N (head-node)
CPU_Y.N (client1-node)

This confilcts with my chronological roots in the decomposeParDict in comparison of starting the calculation via mpirun without PyFoam.
Maybe the difference in the chronology isn't caused by PyFoam!?

@Mark
Thanks once again! This i my first little cluster, so still in practise... I'm pleased the things getting work and the second user, too. If there is spare time soon i keep me busy with these stuff. The next questions appear for sure!

/Stefan

gschaider · January 22, 2010, 08:11

Quote:

Originally Posted by SD@TUB

@Bernhard
I just noticed using the <machinefile> in the way openMPI wants it, e.g.

Code:

head-node cpu=X
client-node1 cpu=Y
...

the PyFoamPlotRunner.py exceed with an error like "to much entries in one line"! So I omit the "cpu=" entries and the ...Runner starts without any problems. In addition to that i realized that the order of the involved CPUs differs to the event started via mpirun itself.

The problem is that PyFoam parses the machine-file to find out whether the number of machines there matches the decomposition. When I did it (it's one of the oldest parts) I went for the easiest solution then: one cpu per line, if a node has more than one cpu it appears in two (or more) lines. I never fixed this because that is the way the SGE on our cluster generates the machinefiles. But I might consider fixing it if a bug-report materialzes itself at http://sourceforge.net/apps/mantisbt...e_status_id=-2

Quote:

Originally Posted by SD@TUB

Therfore i get a descending order:

Code:

master:
CPU_X.1 (head-node)
slaves:
CPU_X.2 (head-node)
...
CPU_X.N (head-node)
CPU_Y.1 (client1-node)
CPU_Y.2 (client1-node)
...
CPU_Y.N (client1-node)

Using the mod <machineFile> according PyFoam (see above) i get an alternate order:

Code:

master:
CPU_X.1 (head-node)
slaves:
CPU_Y.1 (client1-node)
CPU_X.2 (head-node)
CPU_Y.2 (client1-node)
...
CPU_X.N (head-node)
CPU_Y.N (client1-node)

This confilcts with my chronological roots in the decomposeParDict in comparison of starting the calculation via mpirun without PyFoam.
Maybe the difference in the chronology isn't caused by PyFoam!?

Stupid question: "master:"/"slave:" is part of the machinefile? AFAIK PyFoam does not touch the machine-file

Bernhard

olesen · January 22, 2010, 08:32

Quote:

Originally Posted by SD@TUB

...
This confilcts with my chronological roots in the decomposeParDict in comparison of starting the calculation via mpirun without PyFoam.
Maybe the difference in the chronology isn't caused by PyFoam!?

@Mark
Thanks once again! This i my first little cluster, so still in practise... I'm pleased the things getting work and the second user, too. If there is spare time soon i keep me busy with these stuff. The next questions appear for sure!

I think that in this case spending some time to get the automounter and NFS import/export working will save you a lot of time. You could then, for example, simply decompose your geometry and run your job without worrying about decomposition roots.

SD@TUB · January 22, 2010, 17:59

@Bernard
I forgot to say that the CPU order is an output produced by OpenFOAM:

Code:

/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  1.6                                   |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : 1.6-f802ff2d6c5a
Exec   : /data/app/OpenFOAM/OpenFOAM-1.6/applications/bin/linux64GccDPOpt/pisoFoam -parallel
Date   : Jan 22 2010
Time   : 18:24:47
Host   : client1
PID    : 6330
Case   : /net/host/data/
nProcs : 6
Slaves :
5
(
client1.6332
head-node.5024
head-node.5025
client2.4933
client2.4934
)

Pstream initialized with:
floatTransfer     : 0
nProcsSimpleSum   : 0
commsType         : nonBlocking
SigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

I follow your notes and use a machineFile for PyFoam like:

Code:

head-node
head-node
client
client

instead of

Code:

head-node cpu=2
client cpu=2

and everything works! I wouldn't rate this a bug, may i simple conversion could avoid conflicts.

@Mark
I agree with you but for this config i run the calcs from the same roots. So the settings remains unchanged!

gschaider · January 27, 2010, 12:44

Quote:

Originally Posted by SD@TUB

@Bernard

Code:

head-node
head-node
client
client

instead of

Code:

head-node cpu=2
client cpu=2

and everything works! I wouldn't rate this a bug, may i simple conversion could avoid conflicts.

I'll rewrite the error message to be more meaningful.

January 4, 2010, 15:18	Basics: Set-up for different desktop machines	#1
SD@TUB Member Stefan Join Date: Jan 2010 Location: Kiel, Germany Posts: 81 Rep Power: 16	Hello Folks, i'm setting up an OpenFOAM installation on 3 desktop-pcs, which are connected via lan. After searching the forum for consistent information about correct set-up for parallel running, i'm a little confused due to my problem! The system have to work for 2 seperate users, say Usr1 and Usr2. So my idea was: 1st PC: head-node as server 2nd PC: node for Usr1 as client 3nd PC: node for Usr2 as client All machines are Core2Duo with 12GB RAM at all. I installed GNU/Linux Debian 64-bit and OF-1.6 64-bit on all nodes. The server ist permanent running but the clients just when each user works on it. (OF works fine on every single node.) Now i'm facing the problem, that the clients have different directories of course: /home/Usr1 /home/Usr2 Now when i try to run a case in parallel, say on client Usr1 using the server via mpirun, OF is asking for login data Usr1@server. After entering the data i get an error message: Code: bash: orted: command not found ... OpenMPI has some problems with the settings of $LD_LIBRARY_PATH! (I've tried to fix the problem with 2 accounts on the server for Usr1 and Usr2 including the OF-Installation on both accounts, but it didn't change anything) I've read in some threads, that on all nodes there have to be identical directories to make sure OF works properly in combination with NFS/NIS. I need some details on that! Has anybody helpful suggestions to manage my requested set-up? (I can't imagine that there is only the way of using the same user account on every node) thx, Stefan

January 21, 2010, 16:22		#11
SD@TUB Member Stefan Join Date: Jan 2010 Location: Kiel, Germany Posts: 81 Rep Power: 16	@Bernhard I just noticed using the <machinefile> in the way openMPI wants it, e.g. Code: head-node cpu=X client-node1 cpu=Y ... the PyFoamPlotRunner.py exceed with an error like "to much entries in one line"! So I omit the "cpu=" entries and the ...Runner starts without any problems. In addition to that i realized that the order of the involved CPUs differs to the event started via mpirun itself. Therfore i get a descending order: Code: master: CPU_X.1 (head-node) slaves: CPU_X.2 (head-node) ... CPU_X.N (head-node) CPU_Y.1 (client1-node) CPU_Y.2 (client1-node) ... CPU_Y.N (client1-node) Using the mod <machineFile> according PyFoam (see above) i get an alternate order: Code: master: CPU_X.1 (head-node) slaves: CPU_Y.1 (client1-node) CPU_X.2 (head-node) CPU_Y.2 (client1-node) ... CPU_X.N (head-node) CPU_Y.N (client1-node) This confilcts with my chronological roots in the decomposeParDict in comparison of starting the calculation via mpirun without PyFoam. Maybe the difference in the chronology isn't caused by PyFoam!? @Mark Thanks once again! This i my first little cluster, so still in practise... I'm pleased the things getting work and the second user, too. If there is spare time soon i keep me busy with these stuff. The next questions appear for sure! /Stefan

Similar Threads
Thread	Thread Starter	Forum	Replies	Last Post
OF 1.6 \| Ubuntu 9.10 (64bit) \| GLIBCXX_3.4.11 not found	piprus	OpenFOAM Installation	22	February 25, 2010 14:43
OpenFOAM on MinGW crosscompiler hosted on Linux	allenzhao	OpenFOAM Installation	127	January 30, 2009 20:08
Help with GNUPlot	Renato.	Main CFD Forum	6	June 6, 2007 20:51
Env variable not set	gruber2	OpenFOAM Installation	5	December 30, 2005 05:27
How to set environment variables	kanishka	OpenFOAM Installation	1	September 4, 2005 11:15

January 6, 2010, 12:56		#3
SD@TUB Member Stefan Join Date: Jan 2010 Location: Kiel, Germany Posts: 81 Rep Power: 16	Thanks a lot /mark for the helpful remarks! Some questions appear subsequently... (password-less ssh is done by the way) First: Is it needful to have OF installed in each user account Usr1 and Usr2 on the server? (The accounts are still needed to access from the seperate clients, right!?) I'm asking for processing afterwards any updates, configurations, adding users etc. accordingly to OF. Therefore it would be effective of having a centralized installation directory on the server. Second: What do you mean with "OpenFOAM (and thus the openmpi) bits" to source "the respective ~/.bashrc files"? (Do I have to add the path to the OF-Installation on the server e.g. in the ~./bashrc files on the clients? An example would be helpful!) Third: I'm not sure of understanding the NFS-mounts correctly? Here my Set-up so far: Server: /home/Usr1/OpenFOAM/Usr1-1.6/run... /home/Usr2/OpenFOAM/Usr2-1.6/run... Client Usr1: /home/Usr1/OpenFOAM/Usr1-1.6/run... Client Usr2: /home/Usr2/OpenFOAM/Usr2-1.6/run... Do I have to export the Server-directories and automount them on each client or vice versa? (Why is that needed in cases where no distributed data are set in the 'decomposParDict' on the client where the calculation will running via mpirun) Thx, /Stefan

January 8, 2010, 04:09		#5
SD@TUB Member Stefan Join Date: Jan 2010 Location: Kiel, Germany Posts: 81 Rep Power: 16	Ok, things getting more clearly... I'll try to get OF running on all nodes and let the forum know! BTW: TU Berlin is truly correct! Was it a guess or do you have connections to TUB?

January 20, 2010, 17:49		#8
SD@TUB Member Stefan Join Date: Jan 2010 Location: Kiel, Germany Posts: 81 Rep Power: 16	So, after another two days working on the three nodes, everthing works regular. I did the recommended changes by Mark. Now on all nodes i installed OF in /data/app (as root). So no 'real' server/client configuration is present, but that doesn't matter for my usage so far. In addition to that every node has NFS export/import directories. That is needed to ensure the involved nodes find the decompose case data (i.e. processor0, processor1, ..., processorN) on the machine the calculation is started. To keep my things working properly some confusing annotations! When i set the option distributed no in the decomposePar dict ('cause i do not want to allocate my data) the remote nodes expect their part of data on a local drive (in an equivalent directory structure by default). To evade this i use the distribute option and set the roots to the respective NFS export/import directories. Be careful with the right node order in the <machinefile>, too. The host wherefrom starting the mpirun has to be the first in the list. According to the specified CPUs one become a master, the following treated as slaves. When you run the decompose case via PyFoam the needed <machinefile> for e.g. pyFoamPlotRunner.py use a different order of the slave CPUs (I don't figure out why PyFoam not use the <machinefile> standard openMPI suppose to do). Finally i have to remark that for further installation purposes it would be particularly advantageous of using a server with NIS accounts. But I'm not sure if OF has to be installed on every node in the cluster when using openMPI. Maybe someone can clarify this subject!? /Stefan PS: Thanks Mark for the opportunity calling you! I'll come back to you when necessary.