September 13, 2011, 04:39 |
Edge on a Linux cluster
|
#1 |
New Member
Filip Wallberg
Join Date: Oct 2010
Posts: 23
Rep Power: 16 |
Dear Edge users.
I am planning to use Edge to solve a problem with about 100,000,000 cells, and I don't think my average desktop computer will be able to cope with this. Instead, I am planning to put together about 10 desktop computers with 4 cores each, giving me a cluster with 40 cores. The only problem is that I cannot find any information on how to get Edge to run on a cluster. At the moment I am running Edge on Ubuntu with parallel computing, utilizing all 4 cores of the computer without any problems. Do any of you have experience of running Edge on a Linux cluster? What kind of cluster would you use? I have heard about people using the Linux Rocks Cluster with other codes; would this work with Edge as well? As far as I know Edge uses MPI, which is supported by Rocks. Or is there some simpler way of building a cluster in Ubuntu on which I can run Edge? Please advise. Thanks! |
|
September 15, 2011, 02:57 |
|
#2 |
New Member
Filip Wallberg
Join Date: Oct 2010
Posts: 23
Rep Power: 16 |
After some struggle I managed to set up an MPI cluster using the mpd process manager. However, I cannot get Edge to run on it.
For testing purposes I am currently using only two computers, 1 server and 1 node (the server has 2 processors and the node has 4). To boot the cluster I used the following command:

    mpdboot --verbose --ncpus=2 -n 2

which gives the following output:

    running mpdallexit on AB-SATS
    LAUNCHED mpd on AB-SATS via
    RUNNING: mpd on AB-SATS
    LAUNCHED mpd on 192.168.12.173 via AB-SATS
    RUNNING: mpd on 192.168.12.173

mpdtrace confirms that the cluster is up and running. I also ran some of the test applications to confirm that it is actually working. So far so good.
As a next step I prepared the input and mesh files on the server and then cloned them onto the node. I now try to run a multi-process calculation with Edge using the following command:

    edge_mpi_run test.ainp 6

The error message I get from Edge is the following:

    Initialisation started
    Give input-file (.ainp) ?test.ainp
    Reading: "test.ainp"
    done
    ERROR IN EDGE, IN ROUTINE "MIMD_SETUP" !!! --- ERROR FROM MIMD_SETUP, NPARTC=/NPART ---
    ERROR IN EDGE, IN ROUTINE "MIMD_SETUP" !!! --- NPARTC= 1 ---
    ERROR IN EDGE, IN ROUTINE "MIMD_SETUP" !!! --- NPART = 6 ---
    DATE - 110915 TIME - 12:45:48
    --- EXITING EDGE FROM SUBROUTINE "MIMD_SETUP" ---
    **************************************************************************
    * *
    * Starting Edge *
    * Edge 5.0.0 www.foi.se/edge *
    * *
    * Build time Tue Mar 23 22:30:01 CET 2010 *
    * Built by enp *
    * Build system Linux-x86_64 *
    * Build host mohawk *
    * Build FC mpif90 *
    * *
    **************************************************************************
    Date - 110915 Time - 12:45:48
    Initialisation started
    Give input-file (.ainp) ?At line 84 of file /extra3/enp/src/edge/5.0.0/solver/basic/callok_m.f90 (unit = 5, file = 'stdin')
    Fortran runtime error: End of file

I also tried to run the same job locally after first shutting down the cluster, but I end up with the same error message. Does anyone have any suggestions on what I am doing wrong? Do I need to use another process manager? |
|
September 21, 2011, 00:55 |
|
#3 |
New Member
Filip Wallberg
Join Date: Oct 2010
Posts: 23
Rep Power: 16 |
I am happy to report that I managed to get the cluster running using OpenMPI, and it seems to be working without any problems.
However, when I increase the number of active cores past 12, I do not get any further decrease in calculation time. Looking at the resource manager on the different machines, I can see all CPU cores working at 100 % and data being sent/received at about 6 Mb/s. All computers have gigabit LAN adapters and I am using a gigabit switch as well, so I assume the LAN is not the bottleneck? Does someone with more experience have any suggestions? Is there some setting I have overlooked? |
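One way to put numbers on the plateau described above is to time the same case at several core counts and compute the speedup and parallel efficiency. The sketch below is only illustrative: the function names are made up, and the timings in the usage example are hypothetical placeholders, not measurements from this thread. It also computes the Karp-Flatt metric, an experimentally determined serial fraction. If that value grows with the core count, overhead such as communication is increasing with the number of processes.

```python
def speedup(t_ref: float, t_n: float) -> float:
    """Speedup of an n-core run relative to the single-core reference time."""
    return t_ref / t_n


def efficiency(t_ref: float, t_n: float, n: int) -> float:
    """Parallel efficiency: achieved speedup divided by the ideal speedup n."""
    return speedup(t_ref, t_n) / n


def karp_flatt(t_ref: float, t_n: float, n: int) -> float:
    """Karp-Flatt metric for n > 1 cores, with t_ref measured on one core."""
    s = speedup(t_ref, t_n)
    return (1.0 / s - 1.0 / n) / (1.0 - 1.0 / n)


if __name__ == "__main__":
    # HYPOTHETICAL wall-clock times in seconds, keyed by core count.
    # Replace with times measured for a fixed number of solver iterations.
    timings = {1: 1200.0, 4: 320.0, 8: 175.0, 12: 130.0, 16: 128.0}
    t1 = timings[1]
    for n, t in sorted(timings.items()):
        if n == 1:
            continue
        print(f"{n:3d} cores: speedup {speedup(t1, t):5.2f}, "
              f"efficiency {efficiency(t1, t, n):5.2f}, "
              f"Karp-Flatt {karp_flatt(t1, t, n):5.3f}")
```

A flat run time between 12 and 16 cores shows up here as a sharp drop in efficiency, which makes it easier to compare different network or partitioning settings against each other.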
|
January 12, 2012, 04:35 |
|
#4 | |
New Member
Alvin
Join Date: Jun 2011
Posts: 7
Rep Power: 15 |
Quote:
I used to run Edge on one machine with 8 cores and achieved nearly linear speed-up. However, I also ran into your problem of getting the cluster running. Could you please describe the general steps for running Edge on Linux clusters? (I am using MPICH2.) |
||
January 15, 2012, 12:06 |
|
#5 | |
New Member
Filip Wallberg
Join Date: Oct 2010
Posts: 23
Rep Power: 16 |
Quote:
I created a new user on all computers, which I called 'cluster', and gave it the same password on all computers.
After that I installed ssh and set it up so that the user cluster can log on to any other computer in the cluster without having to enter a password.
After that I edited the /etc/hosts file on all the computers so that it includes the host name and IP address of all computers in the cluster.
After that you need a working directory with the same path on all computers, e.g. /home/cluster/edge/. On the server this folder must contain all files created by running the preprocessor. On the nodes it must contain the following files: .ainp, .aboc and all .bedg_p1, 2 ... n files.
On the server you have to create a hostfile for OpenMPI so that it knows how many cores each node has available. The file should be located in your working directory. I simply called it mpi.hosts. The file should be structured in the following way:

    server slots=1
    node1 slots=4
    node2 slots=4
    etc.

If everything is correct, you should now be able to run Edge on all the computers in your mpi.hosts file. To start the calculation, open a terminal window, cd to the working directory and run the following command:

    mpirun.openmpi -n 9 --hostfile mpi.hosts edge_mpi_run.x

When asked, give the name of your .ainp file and it should start running. |
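The hostfile and launch command described above can also be generated from a single list of machines, which keeps the -n value in step with the slot counts. This is a minimal sketch under stated assumptions: the helper names are made up for illustration, the host names and slot counts are the ones from the post, and mpirun.openmpi, --hostfile and edge_mpi_run.x are taken from the post as written.

```python
from pathlib import Path


def hostfile_text(hosts):
    """Render (hostname, slots) pairs in the 'name slots=N' form used in the post."""
    return "".join(f"{name} slots={slots}\n" for name, slots in hosts)


def total_slots(hosts):
    """Total number of MPI processes the listed machines can take."""
    return sum(slots for _, slots in hosts)


def launch_command(hosts, hostfile="mpi.hosts", solver="edge_mpi_run.x"):
    """Build the mpirun argument list, with -n equal to the total slot count."""
    return ["mpirun.openmpi", "-n", str(total_slots(hosts)),
            "--hostfile", hostfile, solver]


if __name__ == "__main__":
    # Host names and slot counts as given in the post.
    hosts = [("server", 1), ("node1", 4), ("node2", 4)]
    # Write the hostfile into the current (working) directory.
    Path("mpi.hosts").write_text(hostfile_text(hosts))
    # Print the command to run from the working directory on the server.
    print(" ".join(launch_command(hosts)))
```

With the three machines listed in the post, this prints the same command as above, mpirun.openmpi -n 9 --hostfile mpi.hosts edge_mpi_run.x.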
||
January 17, 2012, 03:34 |
|
#6 | |
New Member
Alvin
Join Date: Jun 2011
Posts: 7
Rep Power: 15 |
Quote:
|
||
February 24, 2012, 15:21 |
|
#7 | |
New Member
Adam Jirasek
Join Date: Mar 2011
Posts: 18
Rep Power: 15 |
Have you figured out how to run on MPI?
Quote:
|
||