April 17, 2006, 15:30 |
Running MPI code on a multiprocessor node
|
#1 |
Guest
Posts: n/a
|
Hi,
I'd really appreciate it if you could answer this question. I'm trying to use MPI (MVAPICH) + OpenPBS to run a code on a 10-node Linux x86 64-bit system. Right now I can run the code with up to 10 processes in parallel. The question is: each node on the cluster actually has 2 processors, so can I somehow request more than 10 processes? For example, I tried the following, but it failed:
mpirun -np 12 myprogram.exe > myoutput.log
Since I was asking for 12 parallel processes, can't the system figure out that every node has more than one processor and use them? Thanks, Wen
|
April 17, 2006, 22:41 |
Re: Running MPI code on a multiprocessor node
|
#2 |
Guest
Posts: n/a
|
Hi, as far as I know, you can use 'mpirun -np m **.exe' with the right setup of machines.LINUX, and m can be greater than, equal to, or less than the number of processors.
|
|
April 18, 2006, 00:35 |
Re: Running MPI code on a multiprocessor node
|
#3 |
Guest
Posts: n/a
|
Yes, you can do it by specifying two processes per node. I think the syntax to do it in OpenPBS is something like:
#PBS -l nodes=10:ppn=2
This directive requests 10 nodes with 2 processes per node, in other words 20 processes. Hope it helps. Regards, Renato.
|
April 18, 2006, 10:29 |
Re: Running MPI code on a multiprocessor node
|
#4 |
Guest
Posts: n/a
|
I tried what Renato suggested, but it didn't work.
I guess there is some PBS configuration problem; I don't know how to turn that multi-processor option (ppn=2) on. Following Tian_FB's idea, I used a machines.LINUX file that consists of the names of the nodes assigned to the job, with each name repeated (it appears twice in the file), and that worked! The code runs with 20 processes. The problem is that when I use qstat or showq (maui) to check the status of the job, they report only one processor in use on each node. Any more hints? Wen
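The repeated-hostname workaround described above can be scripted instead of edited by hand. This is only a minimal sketch: the function name repeat_hosts is hypothetical, and it assumes the job's node list is available as a file with one hostname per line (under PBS that is normally $PBS_NODEFILE).

```shell
# Hypothetical helper (not from the thread): print every distinct hostname in
# a node list N times, keeping first-seen order.
#   $1 = file with one hostname per line
#   $2 = number of processes to place on each node
repeat_hosts() {
    awk -v n="$2" 'NF && !seen[$1]++ { for (i = 0; i < n; i++) print $1 }' "$1"
}

# Example use inside a job script (file name as used in the thread):
#   repeat_hosts "$PBS_NODEFILE" 2 > machines.LINUX
```

Note that this only changes what mpirun sees. It does not tell PBS or maui that two processors per node are in use, so their accounting still shows one.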
|
April 18, 2006, 11:37 |
Re: Running MPI code on a multiprocessor node
|
#5 |
Guest
Posts: n/a
|
Well, maybe you can use MPI_Get_processor_name to get the processor names and print them out; then you'll see it. I am studying MPI these days and would like to discuss it with you via email: tfbao@mail.ustc.edu.cn. Email me if you have any good ideas or problems.
Tian |
|
April 18, 2006, 13:23 |
Re: Running MPI code on a multiprocessor node
|
#6 |
Guest
Posts: n/a
|
OK, try writing your machinefile in the following manner:
NodeName1:2
NodeName2:2
NodeName3:2
...
NodeName10:2
Cheers, Renato.
|
April 18, 2006, 15:56 |
Re: Running MPI code on a multiprocessor node
|
#7 |
Guest
Posts: n/a
|
If I do that, i.e. use:
node1:2
node2:2
...
node10:2
in the machines.LINUX file for mpirun, it says "getaddrinfo: Temporary failure in name resolution" and stops the job, which means the server can't resolve the node names as written.
Yet if I use:
node1
node1
node2
node2
...
node10
node10
in the machines.LINUX file (this is cheating!), it finds the nodes and gives 20 processes for the run, as requested by "mpirun -np 20 ./myprogram.exe". But OpenPBS (qstat) and maui (showq) don't report that correctly.
If I put "#PBS -l nodes=10:ppn=2" in the PBS job script, the job always stays in Q status, which means it has been deferred; it stays deferred forever and never runs.
Also, "pbsnodes -a" can't tell which nodes the code is actually running on. For example, I can log in to node0 and node10 and use "top" to find myprogram.exe running on them, yet "pbsnodes -a" still says those two nodes are free. What next? Wen
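When comparing machinefile layouts like the two above, it can help to see how many entries each hostname actually contributes. The following is a rough sketch; slots_per_host is a hypothetical name, and it simply tallies the first field of each non-empty line, so a "node1:2" entry is counted as a single host literally named "node1:2" (which matches how the launcher in this thread appears to have read it).

```shell
# Hypothetical helper (not from the thread): tally how often each entry
# appears in a machinefile and print "entry count", sorted by entry.
#   $1 = machinefile with one entry per line
slots_per_host() {
    awk 'NF { count[$1]++ } END { for (h in count) print h, count[h] }' "$1" | sort
}

# Example use:
#   slots_per_host machines.LINUX
```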
|
April 19, 2006, 17:35 |
Problem Solved
|
#8 |
Guest
Posts: n/a
|
Problem solved. It was because I'm not root on the machine: after the configuration is changed, the PBS server needs to be restarted.
Basically:
1) the pbs_server/nodes file should list all nodes:
node1 np=2
node2 np=2
...
2) in the shell script file for job submission:
#PBS -l nodes=10:ppn=2
mpirun_rsh -rsh -hostfile $PBS_NODEFILE -np 20 ./myprog.exe
Cheers
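As a possible refinement of the job script above, the process count can be derived from the node file rather than hard-coded as 20. This is a sketch under the assumption that PBS writes one line per allocated processor slot to $PBS_NODEFILE (so each node appears twice when ppn=2 is granted); count_slots is a hypothetical helper name.

```shell
# Hypothetical helper (not from the thread): count the non-empty lines in a
# node file, i.e. the number of processor slots listed in it.
#   $1 = node file with one hostname per line
count_slots() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Example use inside the job script, with the launcher line from the thread:
#   NP=$(count_slots "$PBS_NODEFILE")
#   mpirun_rsh -rsh -hostfile "$PBS_NODEFILE" -np "$NP" ./myprog.exe
```

This keeps the mpirun_rsh line in step with whatever the #PBS -l request was actually granted.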