Unforeseen problems in scaling up a cluster built with desktop parts? |
May 10, 2011, 17:06 |
#1 |
Senior Member
Join Date: Mar 2009
Location: Austin, TX
Posts: 160
Rep Power: 18 |
We currently have a four-node cluster built with i7 980X chips and cheap motherboards, RAM and power supplies. This has run fine for us for about six months, and now I am planning on scaling it up with more 980Xs or with the newer Sandy Bridge i7s. I'm also planning on picking up some used Infiniband equipment.
Is this something anyone else has done at a scale larger than a few nodes? I am looking to add about 60-80 CPU cores, which could be done for as little as $6,000-$7,000 USD using desktop parts. Picking up similar Xeon machines from HP or Dell would be north of $25,000. I expect the Xeons would scale up better, if for no other reason than that you can put two CPUs on one motherboard, reducing the traffic between nodes. On the other hand, I could buy 3x as many of the desktop machines and still come out saving money. Thoughts? Are Infiniband cards happy in desktop motherboards? Will not having ECC RAM randomly blow up my simulations? (It hasn't so far.) Is there something else I am missing? |
|
May 10, 2011, 20:41 |
|
#2 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Definitely an interesting topic. Desktop parts are much less expensive... even if you lose the bandwidth of server motherboards and CPUs, it still might be well worth it.
I cannot give any advice, though I plan to build a system like yours in the near future. The only problem I can foresee (given my narrow experience) is the interconnect, but you have already planned to buy Infiniband. (How much are you paying for an IB card, if you don't mind sharing?) What are you using for the logistics and housing of the nodes? I assume not normal PC cases. I've found this company on the web, http://www.server8.it/index.php, and its products look interesting (at least for the European market; hardware in the US is much cheaper :-( ), but other options or opinions would be welcome. |
|
May 11, 2011, 23:51 |
|
#3 |
Senior Member
Join Date: Mar 2009
Location: Austin, TX
Posts: 160
Rep Power: 18 |
sail,
I was planning on just throwing them in some cheap 2U cases. Space isn't really an issue and I will still be able to fit it all in one rack anyway. I am not sure what the Infiniband equipment is going to cost me. I assume something like $2500 for all of it. The more I research this, the less sense the Xeon machines make. Sandy Bridge desktop chips have the highest memory bandwidth per core available, and you can use 2133 MHz memory to increase it even more (here is a CFD benchmark for different memory speeds: http://techreport.com/articles.x/20377/2). You cannot use any memory faster than 1333 MHz with Xeons. Supposedly you can easily overclock the Sandy Bridge chips to ~4 GHz as well. I understand why larger companies might go with the enterprise-class stuff, but when hardware is such a significant portion of your budget, desktop parts make much more sense. I'll follow up if I have any issues. |
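As a rough sanity check on the bandwidth-per-core argument, here is a minimal sketch of the theoretical peak numbers. It assumes a quad-core Sandy Bridge desktop chip with dual-channel DDR3-2133, and, purely as an assumed comparison point, a six-core Xeon with triple-channel DDR3-1333. Neither the channel counts nor the Xeon model are stated in the thread, and real sustained bandwidth will be lower than these peaks.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s, channels, bus_width_bytes=8):
    """Theoretical peak DRAM bandwidth in GB/s (10^9 bytes per second).

    Each DDR3 channel is 64 bits wide, so it moves 8 bytes per transfer.
    """
    return transfer_rate_mt_s * 1e6 * bus_width_bytes * channels / 1e9


# Assumption: quad-core desktop chip, dual-channel DDR3-2133.
desktop_total = peak_bandwidth_gb_s(2133, channels=2)
desktop_per_core = desktop_total / 4

# Assumption: six-core Xeon, triple-channel DDR3-1333 (hypothetical comparison part).
xeon_total = peak_bandwidth_gb_s(1333, channels=3)
xeon_per_core = xeon_total / 6

print(f"desktop: {desktop_total:.1f} GB/s total, {desktop_per_core:.2f} GB/s per core")
print(f"xeon:    {xeon_total:.1f} GB/s total, {xeon_per_core:.2f} GB/s per core")
```

Under those assumptions the desktop part comes out ahead per core even though the totals are similar, which is the point being made above. Measured numbers from something like a STREAM run would be the real test.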
|
May 12, 2011, 03:11 |
|
#4 | |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
If you really get an IB setup that cheap, please let me know. Here in Europe they want 200-300 GBP for a card and approx. £3000 for the smallest switch. It would be cheaper for me to take a holiday in the US and do the shopping there. About the unforeseen problems, I would recommend asking on the Beowulf mailing list. It is a list that deals with Beowulf-type clusters (clusters of cheap commodity hardware), and you might get some useful insight or recommendations about putting IB cards in desktop motherboards. Last edited by sail; May 12, 2011 at 03:46. |
||
May 12, 2011, 11:31 |
|
#5 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
And while it is not strictly CFD (though I suppose it involves massively parallelized routines), check out what they did at the Viacom Labs: http://www.anandtech.com/show/4332/v...-gets-bigger/2
|
|
May 19, 2011, 15:10 |
|
#6 | |
New Member
Bob Yin
Join Date: May 2011
Posts: 1
Rep Power: 0 |
Very interesting project... the Sandy Bridge i7 outperforms most server CPUs...
However, I have a question regarding file I/O, especially when you scale the system up to 60-80 CPUs. How do you handle the massive files? Building a file I/O node will be quite expensive, and you don't want to log in to each individual node to copy your files. Do you have any solution for this? Bob
|
||
May 19, 2011, 18:41 |
|
#7 |
Senior Member
Join Date: Mar 2009
Location: Austin, TX
Posts: 160
Rep Power: 18 |
Bob,
You definitely do not need to copy files around to each node. Any HPC cluster, whether built from commodity parts or purchased as a system, should have a networked filesystem that all nodes have access to. My file storage node has four 2 TB drives in RAID 5. That whole setup only cost about $1000. Unless you are doing something funky like saving the entire flow field history at a high time resolution, your simulation speed is not going to be limited by filesystem I/O. |
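For anyone sizing a similar storage node, a minimal sketch of the RAID 5 capacity arithmetic: one drive's worth of space goes to parity, so usable capacity is (number of drives - 1) times the drive size. The function name is just for illustration, and filesystem overhead is ignored.

```python
def raid5_usable_tb(n_drives, drive_tb):
    """Usable capacity of a RAID 5 array, ignoring filesystem overhead."""
    if n_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    # One drive's worth of capacity is consumed by distributed parity.
    return (n_drives - 1) * drive_tb


usable = raid5_usable_tb(4, 2.0)
print(f"4 x 2 TB in RAID 5: {usable:.0f} TB usable")
```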
|
May 26, 2011, 22:09 |
|
#8 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Hi there. How is the shopping/planning going?
I just happened to stumble upon a discussion about ECC vs non-ECC RAM; it might be worth a look. http://www.beowulf.org/archive/2011-May/028799.html Best regards. |
|
May 27, 2011, 19:00 |
|
#9 |
Senior Member
Join Date: Mar 2009
Location: Austin, TX
Posts: 160
Rep Power: 18 |
sail,
Interesting discussion. They seem pretty split on whether or not ECC is necessary. I think CFD might have an advantage in that a flipped bit is extremely unlikely to give you a wrong answer... it could only crash the simulation or throw a nonsense value into a cell that would be corrected in the next iteration. I have ordered 12 i7 2600K CPUs with MSI P67 motherboards and 8 GB of 2133 MHz memory for each node. I am going to try to get the thing running on Ethernet before I go hunting for Infiniband equipment. Scaling should be pretty good up to 4 nodes on gigabit. I'll let you know how it goes. I am hoping to have it up and running by the third week in June. |
|
May 30, 2011, 07:53 |
|
#10 |
Senior Member
Attesz
Join Date: Mar 2009
Location: Munich
Posts: 368
Rep Power: 17 |
Hi there,
It's a really interesting topic. I would like to build a small cluster for my small start-up CFD company. To keep the price down, I would prefer AMD CPUs, namely AMD Phenom II X6 1100T processors. Do you have any experience using these? I would buy a mid-range motherboard, or maybe server boards (with 2 processors), and I would like to use DDR3 RAM at 1333 MHz or faster, with 16 GB in each node. At first I would have only 2 nodes, so they would communicate over gigabit Ethernet. I would use another "node", a weaker PC (dual-core, 8 GB RAM), as the file server, containing some 2 TB HDDs in RAID. I would run OpenFOAM on them. What do you think? Regards, Attila |
|
June 30, 2011, 11:55 |
|
#11 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Hello there. How is the work going? Is your cluster already up and running? Any impressions you would like to share?
|
|
July 1, 2011, 07:06 |
|
#12 |
Senior Member
Attesz
Join Date: Mar 2009
Location: Munich
Posts: 368
Rep Power: 17 |
I've read a little more about clusters and processors, and I found that the Sandy Bridge processors are much better than the Phenoms. I would buy them, but building this cluster is not a current priority for me.
|
|
July 1, 2011, 15:26 |
|
#13 |
Senior Member
Join Date: Mar 2009
Location: Austin, TX
Posts: 160
Rep Power: 18 |
I've got 15 cases, power supplies and processors sitting in an empty office. The rest of the parts should be here in a few days.
|
|
July 6, 2011, 11:59 |
|
#14 |
New Member
Join Date: Jul 2011
Posts: 1
Rep Power: 0 |
Kyle,
I am now playing with a Phenom II X6, but I would like to know more about your cluster configuration and prices. Can you give us a contact for your supplier, please? My supplier charges 60 USD per AMD Phenom II X6 3.2 GHz. Last edited by AlgoTrader; July 10, 2011 at 08:10. |
|
August 16, 2011, 02:43 |
|
#15 |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Hi Kyle.
Just wondering how your new cluster is working... |
|
August 16, 2011, 11:41 |
|
#16 |
Senior Member
Join Date: Mar 2009
Location: Austin, TX
Posts: 160
Rep Power: 18 |
sail,
I got it up and running a couple of weeks ago. We went with 15 i7 2600Ks with 8 GB of RAM each. I bought some old Infiniband equipment off eBay for very cheap, and it all worked fine. The scaling is almost perfectly linear on our simulations all the way up to 60 cores. I am definitely happy we did it this way. All together we spent just under $10,000 USD. If we had bought an HPC system or a bunch of servers from Dell or HP, it would have been $30,000 minimum. |
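To put those totals on a per-core basis, here is a minimal sketch of the arithmetic. It takes the 15 quad-core nodes and the two dollar figures from the post above, and it assumes the vendor quote would have covered the same 60 cores, which the post does not state. Since the build cost "just under" $10,000 and the vendor option "$30,000 minimum", the first figure is an upper bound and the second a lower bound.

```python
CORES_PER_NODE = 4  # the i7 2600K is a quad-core part


def cost_per_core(total_usd, nodes, cores_per_node=CORES_PER_NODE):
    """Hardware cost per core for a cluster of identical nodes."""
    return total_usd / (nodes * cores_per_node)


total_cores = 15 * CORES_PER_NODE

# "Just under $10,000": treat as an upper bound for the desktop build.
desktop_usd_per_core = cost_per_core(10_000, nodes=15)

# "$30,000 minimum": lower bound, assuming the same 60 cores (assumption).
vendor_usd_per_core = cost_per_core(30_000, nodes=15)

print(f"{total_cores} cores")
print(f"desktop build: under ${desktop_usd_per_core:.0f} per core")
print(f"vendor system: at least ${vendor_usd_per_core:.0f} per core")
```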
|
August 16, 2011, 23:20 |
|
#17 | |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
Just out of curiosity: did you decide to go with IB after running some cases/benchmarks over gigabit LAN, or did you just bite the bullet and grab IB from the start? |
||
August 16, 2011, 23:38 |
|
#18 |
Senior Member
Join Date: Mar 2009
Location: Austin, TX
Posts: 160
Rep Power: 18 |
We already had a four-node cluster with gig-e, and were seeing diminishing returns on the speedup. Going all the way up to 15 nodes without a faster interconnect was never a consideration.
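For anyone who wants to quantify "diminishing returns" on their own cluster, a minimal sketch of the usual speedup and parallel efficiency calculation follows. The wall-clock times in the example are made-up placeholders, not measurements from this cluster.

```python
def speedup(t_one_node, t_n_nodes):
    """Speedup relative to the single-node wall-clock time."""
    return t_one_node / t_n_nodes


def parallel_efficiency(t_one_node, t_n_nodes, n_nodes):
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return speedup(t_one_node, t_n_nodes) / n_nodes


# Hypothetical timings in seconds for the same case on 1, 2 and 4 nodes.
timings = {1: 1000.0, 2: 520.0, 4: 300.0}

for n, t in timings.items():
    s = speedup(timings[1], t)
    e = parallel_efficiency(timings[1], t, n)
    print(f"{n} node(s): speedup {s:.2f}, efficiency {e:.0%}")
```

Efficiency that falls steadily as nodes are added on the same case is the pattern that suggests the interconnect, rather than the CPUs, is becoming the bottleneck.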
|
|
August 17, 2011, 04:48 |
|
#19 | |
Senior Member
Vieri Abolaffio
Join Date: Jul 2010
Location: Always on the move.
Posts: 308
Rep Power: 17 |
kudos! |
||
November 8, 2011, 13:08 |
|
#20 |
New Member
don
Join Date: Nov 2011
Posts: 1
Rep Power: 0 |
Have you run an HPL benchmark on the system? Is there a measurement of the power consumption? I am very interested in the performance and power efficiency of your setup.
|