Two computer cluster - problems with settings |
September 3, 2016, 21:16 |
|
#1 |
Member
Darko Radenkovic
Join Date: Oct 2015
Posts: 38
Rep Power: 11 |
Hello.
I believe I have read every post on this forum about creating a small cluster, but I have again failed to set up my two computers as a cluster. In my case the server hostname is darko and the client hostname is dradenkovic. Both computers have the same username, dradenkovic. I used NFS. SSH works without problems. I defined the addresses in /etc/hosts on both computers.

Settings on the server:
--------------------------------------------------
/etc/fstab:
/home/dradenkovic/OpenFOAM /export/OpenFOAM none bind 0 0 #(address of the OpenFOAM installation)
--------------------------------------------------
/etc/exports:
/export/OpenFOAM <client address>(rw,nohide,insecure,no_subtree_check,async)
--------------------------------------------------
Settings on the client computer:
--------------------------------------------------
/etc/fstab:
darko:/export/OpenFOAM /home/dradenkovic/ nfs 0 0 #(Here I tried various settings, none of them worked)
--------------------------------------------------

Sometimes it looks like there is some conflict, so that I cannot even log into my client computer. From everything I have read, I believe I need to create an identical file structure on both computers. I still have to check this.

If OpenFOAM is located at /home/dradenkovic/OpenFOAM on the server, does it have to be at the same location on the client (i.e. /home/dradenkovic/OpenFOAM on the client, in my case)? If so, what should I change in the settings above? Is it possible to mount the OpenFOAM directory in some other folder, for example /mnt?

I am asking because all day, whenever I tried to run an example case, I got this error: HTML Code:
--------------------------------------------------------------------------
mpirun was unable to find the specified executable file, and therefore
did not launch the job. This error was first reported for process
rank 12; it may have occurred for other processes as well.

NOTE: A common cause for this error is misspelling a mpirun command
line parameter option (remember that mpirun interprets the first
unrecognized command line token as the executable).

Node: dradenkovic
Executable: /home/dradenkovic/OpenFOAM/OpenFOAM-dev/platforms/linux64GccDPInt32Opt/bin/pisoFoam
--------------------------------------------------------------------------
12 total processes failed to start

Here is decomposeParDict: HTML Code:
/*--------------------------------*- C++ -*----------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  dev                                   |
|   \\  /    A nd           | Web:      www.OpenFOAM.org                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
FoamFile
{
    version     2.0;
    format      ascii;
    class       dictionary;
    location    "system";
    object      decomposeParDict;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

numberOfSubdomains 24;

method          simple;

simpleCoeffs
{
    n               (24 1 1);
    delta           0.001;
}

hierarchicalCoeffs
{
    n               (1 1 1);
    delta           0.001;
    order           xyz;
}

manualCoeffs
{
    dataFile        "";
}

distributed     no;

roots           ( );

Best regards, Darko |
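The error above shows that mpirun looks for the solver under the same absolute path on the node dradenkovic as on the machine that started the job. A quick way to test that before launching anything is sketched below. It is only a sketch: remote_has_executable is a hypothetical helper name, and it assumes that ssh to the node already works.

```shell
#!/bin/sh
# Sketch: does an executable exist at the given absolute path on a remote node?
# remote_has_executable is a hypothetical helper, not part of OpenFOAM or Open MPI.

remote_has_executable() {
    # $1 = host name, $2 = absolute path of the executable (assumed to contain no spaces)
    ssh "$1" test -x "$2"
}

# Example use, with host and path copied from the error message above:
# solver=/home/dradenkovic/OpenFOAM/OpenFOAM-dev/platforms/linux64GccDPInt32Opt/bin/pisoFoam
# remote_has_executable dradenkovic "$solver" || echo "pisoFoam is missing on dradenkovic"
```

If the check fails, the installation is not visible at that path on the node, whatever the mount settings are.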
|
November 9, 2017, 04:35 |
|
#2 |
Member
|
Hello Darko,
I ran into a similar problem. I tried to run the icoFoam case on two nodes, debian1 (16 CPUs) and debian2 (16 CPUs). In the end I got these errors:

tanjianyu@debian1:~/OpenFOAM/tanjianyu-v1706/run/tutorials/incompressible/icoFoam/cavity/cavity$ tail -f log
rank 16; it may have occurred for other processes as well.

NOTE: A common cause for this error is misspelling a mpirun command
line parameter option (remember that mpirun interprets the first
unrecognized command line token as the executable).

Node: debian2
Executable: /home/tanjianyu/OpenFOAM/OpenFOAM-v1706/platforms/linux64GccDPInt32Opt/bin/icoFoam
--------------------------------------------------------------------------
16 total processes failed to start

Have you solved it? Could you give me some advice?
Best regards, Chengan |
|
November 9, 2017, 06:19 |
|
#3 |
Member
Darko Radenkovic
Join Date: Oct 2015
Posts: 38
Rep Power: 11 |
Hello Chengan,
I solved the problem, but not in an elegant way. If I miss something, bear with me; it has been a while since I solved this.
1. I did not use NFS. I gave up on that. I had two identical systems (the same files under the same paths), with the same user names.
2. For some reason, the line source $HOME/OpenFOAM/OpenFOAM-dev/etc/bashrc has to go at the beginning of .bashrc instead of the end, on both computers.
3. You can define the IP addresses in /etc/hosts, with the first one 127.0.0.1 (I am not sure about the part in bold letters; check somewhere else; if you find out, correct me in another post)
4. Define the number of cores in the file machines
5. Run mpirun ...
Again, not elegant, but it works.
Good luck, Darko |
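Step 4 can be cross-checked against the -np value given to mpirun. The sketch below assumes the machines file uses Open MPI's hostfile syntax (one host per line, optionally followed by slots=N). count_slots is a hypothetical helper name, and counting a line without slots= as one slot is an assumption of this sketch, since Open MPI's own default differs between versions.

```shell
#!/bin/sh
# Sketch: sum the slots declared in an Open MPI style hostfile.
# count_slots is a hypothetical helper, not part of Open MPI or OpenFOAM.

count_slots() {
    # $1 = path of the hostfile (called "machines" in this thread)
    awk '
        /^[ \t]*(#|$)/ { next }      # skip comment lines and blank lines
        {
            n = 1                    # assumption: no slots= entry counts as one slot
            for (i = 2; i <= NF; i++)
                if ($i ~ /^slots=[0-9]+$/) { split($i, kv, "="); n = kv[2] }
            total += n
        }
        END { print total + 0 }
    ' "$1"
}

# Example use:
# count_slots machines
```

The printed total should not be smaller than the number passed to -np, and -np in turn has to match numberOfSubdomains in decomposeParDict.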
|
November 10, 2017, 05:59 |
|
#4 |
Member
|
Dear Darko,
Thank you very much for your help. Actually, I also had two identical systems (the same files under the same paths), with the same user names, like you. I tried your method, but it doesn't work on my two computers. Could there be a problem with some versions of OpenFOAM? There was no problem with MPI and NFS. Thank you again. Best regards, Chengan |
|
November 10, 2017, 08:21 |
|
#5 |
Member
Darko Radenkovic
Join Date: Oct 2015
Posts: 38
Rep Power: 11 |
Chengan,
I use this approach and it works. I don't believe that the error depends on the OpenFOAM version, but if we don't succeed, you can try a different OpenFOAM version.
Try to run some other tutorial case in parallel, for example channel395.
Can you copy files from one computer to the other without a password?
Paths, OpenFOAM files, case files, user names, sourcing OpenFOAM in the first line of .bashrc on both computers: is absolutely everything identical?
Can you show your /etc/hosts and machines files? The command that you use to start the job? The complete error log?
Regards, Darko |
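Two of the questions above can be checked from the command line. The following is a minimal sketch, with check_node as a hypothetical helper name. It assumes the remote login shell reads the OpenFOAM settings for non-interactive commands, which depends on the shell and on the distribution.

```shell
#!/bin/sh
# Sketch: check passwordless ssh and solver visibility on one remote node.
# check_node is a hypothetical helper, not part of OpenFOAM or Open MPI.

check_node() {
    host=$1      # remote host name, e.g. debian2
    solver=$2    # solver name, e.g. icoFoam

    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ! ssh -o BatchMode=yes "$host" true; then
        echo "$host: passwordless ssh failed"
        return 1
    fi

    # command -v succeeds only if the remote non-interactive shell
    # has the solver on its PATH.
    if ! ssh -o BatchMode=yes "$host" "command -v $solver" >/dev/null; then
        echo "$host: $solver not found in PATH"
        return 1
    fi

    echo "$host: ok"
}

# Example use:
# check_node debian2 icoFoam
```

A "not found in PATH" result would point at the OpenFOAM bashrc not being sourced for non-interactive logins on that node.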
|
November 11, 2017, 03:13 |
|
#6 |
Member
|
Dear Darko
I can copy files from one computer to another without password. Absolutely everything is identical on the two nodes. The /etc/hosts is: Code:
127.0.0.1 localhost
10.246.251.4 ubuntu1
10.246.251.5 ubuntu2
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
Code:
#!/bin/sh
cd ${0%/*} || exit 1    # Run from this directory

# Source tutorial run functions
. $WM_PROJECT_DIR/bin/tools/RunFunctions

runApplication blockMesh
runApplication decomposePar
mpirun --hostfile machines -np 32 $(getApplication) -parallel > log &
runApplication reconstructPar
Code:
/*---------------------------------------------------------------------------*\
| =========                 |                                                 |
| \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox           |
|  \\    /   O peration     | Version:  v1706                                 |
|   \\  /    A nd           | Web:      www.OpenFOAM.com                      |
|    \\/     M anipulation  |                                                 |
\*---------------------------------------------------------------------------*/
Build  : v1706
Arch   : "LSB;label=32;scalar=64"
Exec   : icoFoam -parallel
Date   : Nov 11 2017
Time   : 08:34:42
Host   : "ubuntu1"
PID    : 4808
Case   : /home/tanjianyu/OpenFOAM/tanjianyu-v1706/run/tutorials/incompressible/icoFoam/cavity/cavity
nProcs : 32
Slaves :
31
(
"ubuntu1.4809"
"ubuntu1.4810"
"ubuntu1.4811"
"ubuntu1.4812"
"ubuntu1.4813"
"ubuntu1.4814"
"ubuntu1.4815"
"ubuntu1.4816"
"ubuntu1.4817"
"ubuntu1.4818"
"ubuntu1.4819"
"ubuntu1.4821"
"ubuntu1.4830"
"ubuntu1.4831"
"ubuntu1.4836"
"ubuntu2.22649"
"ubuntu2.22650"
"ubuntu2.22651"
"ubuntu2.22652"
"ubuntu2.22653"
"ubuntu2.22654"
"ubuntu2.22655"
"ubuntu2.22656"
"ubuntu2.22657"
"ubuntu2.22658"
"ubuntu2.22659"
"ubuntu2.22660"
"ubuntu2.22661"
"ubuntu2.22662"
"ubuntu2.22663"
"ubuntu2.22664"
)

Pstream initialized with:
    floatTransfer      : 0
    nProcsSimpleSum    : 0
    commsType          : nonBlocking
    polling iterations : 0
sigFpe : Enabling floating point exception trapping (FOAM_SIGFPE).
fileModificationChecking : Monitoring run-time modified files using timeStampMaster (fileModificationSkew 10)
allowSystemOperations : Allowing user-supplied system call operations

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
Create time

Create mesh for time = 0

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 18 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------

Thank you very much. Chengan |
|
November 11, 2017, 09:41 |
|
#7 |
Member
Darko Radenkovic
Join Date: Oct 2015
Posts: 38
Rep Power: 11 |
Chengan,
Can you try step by step? On one computer: in your cavity case, run blockMesh, then run decomposePar. Then copy the whole case to the other computer, to the same path as on the first computer. Find out on which computer you need to source RunFunctions (I haven't done that, so I don't know). The case has to be identical (and decomposed) on both computers. Then run the mpirun command:
mpirun --hostfile machines -np 32 icoFoam -parallel
Regards, Darko |
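The copy step can be scripted. This is a minimal sketch, assuming rsync is installed on both machines and passwordless ssh works; sync_case is a hypothetical helper name, and the parent directory of the case is assumed to exist on the other computer already.

```shell
#!/bin/sh
# Sketch: copy a decomposed case to the same absolute path on another node.
# sync_case is a hypothetical helper, not part of OpenFOAM.

sync_case() {
    # $1 = remote host, $2 = absolute path of the case directory
    # The trailing slashes make rsync copy the directory contents, so the
    # processor* directories end up under the same path on the remote side.
    rsync -a "$2/" "$1:$2/"
}

# Example use, run from inside the case directory after decomposePar:
# sync_case ubuntu2 "$PWD"
```

Using the same absolute path on both sides matters because every MPI rank is started with the same case directory.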
|
November 12, 2017, 02:41 |
|
#8 |
Member
|
Dear Darko
Thank you very much for your help. Finally I succeeded!! I tried your method step by step. In the end I got 16 results on ubuntu1 and the other 16 results on ubuntu2. I have to copy them together and then run the command reconstructPar. So I tried to use NFS again and put Code:
source $HOME/OpenFOAM/OpenFOAM-dev/etc/bashrc
I still have a question: could we write the command Code:
mpirun --hostfile machines -np 32 $(getApplication) -parallel > log &
Code:
runParallel $(getApplication)
Thank you again! Best regards, Chengan |
|
November 12, 2017, 06:23 |
|
#9 |
Member
Darko Radenkovic
Join Date: Oct 2015
Posts: 38
Rep Power: 11 |
Dear Chengan,
I am glad that you succeeded. I don't know the answer to your question, but you can experiment now. Regards, Darko |
|
November 13, 2017, 05:16 |
|
#10 |
Member
Ricky
Join Date: Jul 2014
Location: Germany
Posts: 78
Rep Power: 12 |
Hello Chengan,
If I understood your question correctly, then there is a way of doing that. In your .bashrc add: Code:
source $HOME/OpenFOAM/OpenFOAM-dev/etc/bashrc
source $WM_PROJECT_DIR/bin/tools/RunFunctions
Or, if you want your own self-defined function, you can start with this one in .bashrc and keep modifying it according to your needs: Code:
runPar() { mpirun --hostfile "$1" -np "$2" "$3" -parallel > "log.$3"; }
Code:
runPar machines 32 simpleFoam
Regards, Ricky
#Note: $WM_PROJECT_DIR is only recognized after sourcing OpenFOAM.
__________________
If it is easy, then something is fishy! Last edited by kera; November 13, 2017 at 08:12. |