linux cluster

June 20, 2003, 10:43    #1
linux cluster
Johan Carlsson (Guest)
Hi,

We are planning to set up a Linux cluster at my work. Right now we are putting together a proposal for this, and we thought some expert opinions would be helpful.

About 5 people will use the cluster, and various types of programs will be run on it. Typically, each run takes about a week on a single-processor machine. With a cluster we would run longer jobs, though.

What type of hardware would you recommend (which bottlenecks should be avoided)? Number of processors? Network requirements? Software?

Thanks in advance, Johan


June 20, 2003, 13:32    #2
Re: linux cluster
4xF (Guest)
Linux is very stable (I am currently running on it). Any standard Linux distribution will be fine, but avoid Red Hat 9.0 (not so stable). Personally, I prefer SuSE 8.0, but SuSE 8.1 will be fine, too. If you want to set up a queuing system, OpenPBS could do it; it also lets you check the load before starting a computation. You may also need NFS (Network File System) and NIS (Network Information Service), depending on how you want to distribute the stored data and how you want to grant access to the cluster.

Do NOT use twin-processor machines. Typically the memory bus bandwidth is not high enough, which means that two serial calculations (or one two-process parallel calculation) on a dual machine will not run as fast as the same calculations on two separate single-processor machines.

Concerning the network: if you run on more than 16 processors in parallel, you should consider at least Myrinet for the global communication. 100 Mbit Ethernet will do fine for anything below 8-16 processors.

Please do not forget to consider backup. The data generated by jobs running on a cluster can become quite large, so do not spare the money on this side ;-)
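
To make the OpenPBS suggestion a bit more concrete, here is a minimal sketch (my own example, not something from the post) of submitting a parallel job from Python. The resource requests and the run_star_parallel.sh wrapper are placeholders you would replace with whatever starts your actual STAR run.

Code:
# Sketch only: write an OpenPBS job script and hand it to qsub.
# The resource requests and the solver wrapper script are assumptions.
import subprocess

job_script = """#!/bin/sh
#PBS -N star_job
#PBS -l nodes=4:ppn=1
#PBS -l walltime=168:00:00
cd $PBS_O_WORKDIR
# hypothetical wrapper that launches the parallel run on the nodes in $PBS_NODEFILE
./run_star_parallel.sh
"""

with open("star_job.pbs", "w") as f:
    f.write(job_script)

# qsub prints the new job id on success; check=True raises if submission fails
subprocess.run(["qsub", "star_job.pbs"], check=True)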

June 23, 2003, 02:00    #3
Re: linux cluster
cjtune (Guest)
http://www.fluent.com/software/fluen...rt/hpclust.pdf

I found the above article quite informative, and I think the performance issues are similar for STAR.

It seems to me that clusters are cost-effective (i.e. a good performance-to-price ratio) if you can't afford anything better than a 4-CPU Xeon machine (obviously RISC chips are out of the question).

For small or mid-sized transient problems with very long running times (i.e. many, many time steps), an SMP system might be better, because the latency of normal Ethernet gets in the way, and Myrinet/Dolphin is not worth it for just a few machines.
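
A rough back-of-the-envelope estimate of what that overhead costs, with made-up but plausible numbers (the message counts, sizes, latencies and bandwidths below are my assumptions, not measurements):

Code:
# Each small boundary exchange pays the full network latency plus the
# transfer time; over very many time steps this adds up on a slow network.
def step_comm_time(n_msgs, msg_bytes, latency_s, bandwidth_bytes_per_s):
    """Rough communication time per time step for one process."""
    return n_msgs * (latency_s + msg_bytes / bandwidth_bytes_per_s)

msgs_per_step = 50          # assumed small exchanges per time step
msg_size = 2 * 1024         # assumed 2 kB of boundary data per message
n_steps = 100_000           # "many, many time steps"

fast_eth = step_comm_time(msgs_per_step, msg_size, 120e-6, 100e6 / 8)
myrinet = step_comm_time(msgs_per_step, msg_size, 10e-6, 2e9 / 8)

print(f"Fast Ethernet: {fast_eth * n_steps / 3600:.2f} h spent communicating")
print(f"Myrinet:       {myrinet * n_steps / 3600:.2f} h spent communicating")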


June 24, 2003, 21:06    #4
Re: linux cluster
skipio (Guest)
The main problem that you will have with Xeon Linux clusters is that they are 32-bit machines (actually pseudo 36-bit, but let's not go into that), hence the 2 GB memory limit per executable process. That means that jobs over about 2 million cells will have to be split over 2 processors. Whether this is done on a dual-processor machine or on two single-processor machines is another matter.

So if you go for single-CPU machines you won't need more than 3 GB of memory per machine. About 0.35 GB will be taken by the OS and another chunk of around 0.25 GB is lost to the way the PCI bus operates. That leaves just over 2 GB for jobs, and that is the limit per CPU anyway. Note that with dual machines you also lose about 0.6 GB, so you are left with about 1.7 GB per CPU, unless you use the new Intel E7505 chipset, which supports over 6 GB of memory (the pseudo 36-bit Xeon bus allows this) and moves the PCI and OS memory to the region above the 4 GB limit (much like what DOS 6 did in the old days by moving part of the OS into the 640-1024 KB region). In that case you are left with more than 2 GB per CPU, which you will not really be able to use, but at least each CPU has as much usable memory as on a single-CPU machine.
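
Just to spell out that arithmetic (using the poster's own overhead estimates, which are rough figures, not measurements):

Code:
# Reproduce the per-CPU usable-memory estimate from the post above.
OS_OVERHEAD_GB = 0.35    # memory claimed by the operating system (estimate)
PCI_OVERHEAD_GB = 0.25   # address space lost to the PCI bus (estimate)

def usable_per_cpu_gb(total_gb, n_cpus):
    """Memory left for solver processes, per CPU."""
    return (total_gb - OS_OVERHEAD_GB - PCI_OVERHEAD_GB) / n_cpus

print(usable_per_cpu_gb(3.0, 1))   # single-CPU box with 3 GB -> ~2.4 GB
print(usable_per_cpu_gb(4.0, 2))   # dual-CPU box with 4 GB   -> ~1.7 GB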

Note, however, that prostar/proam (some proam functions can use more CPUs) can essentially only be used in single-CPU mode. So if you try to mesh a 3 million cell case with proam, or need a 4-5 million cell custom mesh, that will be impossible on a 4 GB dual-CPU Xeon box.

Regarding memory bandwidth, I find that the latest dual Xeon machines have enough of it. A single 1 GHz Itanium CPU with 3 MB cache will finish a 1.8 million cell run in about the same time it takes on prohpc on a dual 2.2 GHz Xeon machine. If the jobs are smaller than 0.8 GB, the 1:2 ratio becomes about 1:1.6, which means that the memory bandwidth on dual Xeon machines is not too bad. Note, however, that within this year Intel will bring out a new Xeon platform using DDR400 memory instead of the DDR266 that the Intel E7505 chipset employs. Such a machine should increase Xeon workstation memory bandwidth by at least 50%.

Obviously a cluster of 4-5 dual machines will perform much better with Gigabit Ethernet than with 100 Mbit Ethernet.

June 27, 2003, 14:45    #5
Re: linux cluster
steve (Guest)
I am afraid you have written quite a number of incorrect statements.

1) With certain limitations (enough memory, kernel compiled correctly, no single malloc or common block larger than 2 GB), Linux executables can grab close to 4 GB (2^32), not 2 GB (2^31); a rough illustration follows after point 4. We have run prostar using as much as 3.5 GB on a 32-bit Linux system.

2) Even if you have a single-CPU machine, most of 4 GB can be used by a single process, so you may want that much memory. It also allows you to run two simultaneous 2 GB processes without swapping, which you may need to do at some point. To say that all you might ever need is 3 GB for one processor is just not true.

3) It is quite possible to build a 6-8 million cell model on a 32-bit Linux box with 4 GB of memory. I know for a fact it has been done.

4) It may be that a 1 GHz Itanium is as fast as 2 x 2.2 GHz Pentiums, but Pentium Xeons now come in 2.8 GHz and I think even 3.0 GHz (not sure). So why compare an Itanium to a Pentium CPU that is more than a year out of date and 30-40% slower than what you can buy today?
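
As a crude illustration of point 1 (my own sketch, not something from the post): a 32-bit process that allocates in pieces well under 2 GB can keep going until it runs out of address space or RAM, and on a properly set up box that point is close to 4 GB, not 2 GB.

Code:
# Sketch only: grab memory in 512 MB pieces until allocation fails.
# On a 32-bit Linux box this stops near the address-space limit;
# run it only on a machine with enough RAM/swap to spare.
CHUNK = 512 * 1024 * 1024          # each allocation stays well under 2 GB
blocks = []
total = 0
try:
    while True:
        blocks.append(bytearray(CHUNK))   # one new allocation per loop
        total += CHUNK
        print(f"allocated {total / 2**30:.1f} GB so far")
except MemoryError:
    print(f"stopped: this process could hold about {total / 2**30:.1f} GB")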


July 1, 2003, 19:05    #6
Re: linux cluster
skipio (Guest)
4) I am comparing a 1 GHz Itanium with a 2.2 GHz Pentium Xeon (400), which is the same CPU as a 2.8 GHz Xeon (400) in every respect apart from clock speed. The newer Pentium Xeons (533) are just a bit faster than their (400) equivalents. As far as I know there will not be a Xeon higher than 3.4 GHz by the end of this year, and the new 1.3, 1.4 and 1.5 GHz Itaniums will be out this year. The Itanium : Xeon performance ratio I mentioned seems to remain almost the same even when the new and upcoming CPUs due out this year are taken into account.

3) Prostar requires much less than 1 kB of memory per cell, so with 2 GB of memory one should be able to generate a 3-3.5 million cell mesh. So your example of "We have run prostar using as much as 3.5Gb on a 32bit linux system" isn't surprising. But I doubt that one can run this 3-3.5 million cell mesh on a single CPU.

2)"It also allows you to run 2 simultaneous 2Gb processes without swapping." If a machine has 4Gb of memory and you can run a 4 million cells StarCd executable which requires around 4Gb of memory swapping will not be avoided. As far as i know Linux like any other os consumes / requires part of the memory. So about 0.35Gb can not be used for used applications. Even if one strips the kernel down to bare essentials the os will take up some of the memory which can not be used for applications.

1) As far as I recall, even STAR-CD themselves, at their last user meeting in London, acknowledged the 2 GB executable problem on 32-bit Linux, and I haven't heard anything new to get around this on 32-bit machines. I would be interested to know what the required kernel settings are, and how you ensure that no single malloc or common block is larger than 2 GB when compiling and running STAR-CD.

July 4, 2003, 13:02    #7
Re: linux cluster
4xF (Guest)
The problem with the 2 GB memory limitation is *not* due to the OS, since you can recompile your kernel to access up to 3.5 GB of memory. The limitation was due to the Absoft compiler (up to v7.5), which could not make use of 64-bit pointers to overcome the 2 GB limit that comes with 32-bit pointers. This problem should be resolved from Absoft v8.0 on...
