Hardware recommendation for combustion solvers - Forte - Converge |
February 11, 2020, 23:03 |
|
#1 |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
I'm beginning some ICE simulations for my PhD research - and I'm looking for a hardware recommendation.
Right now I am using a $300 Acer with Windows 10, an i5-8400, a single stick of 8 GB memory, and some Optane thing. Yes, I know, a second stick in the other memory channel will perk it up quite a bit, and I have one for it. I have been running Ansys Forte R19.3 Student. I've run a few tutorials, one of which was a single-cylinder, port-injected, spark-ignition engine. It was around 400k cells, with chemistry only enabled during combustion/expansion, and it took about 30 hrs. Ansys says it should be 20 hrs "on a cluster with 16 nodes with dual intel xeon E5-2690 at 2.9ghz (8 total cores)" - I think there is a typo by Ansys in there... AFAIK, the student version of Ansys lets me run 16 cores. I have also secured an academic license for Converge 3.0, and will likely be using that to do the actual work. I do not know what the limits on that are.
What sort of computing power will I need to get this done? I'll be doing simulations of a 3-cylinder, port-injected, spark-ignition engine. Based on what I've seen using Forte, it will be about 1.5 million cells. Will my existing desktop get it done? 3-4 days of running is OK, 2 weeks is not OK. I started a Forte multi-cylinder tutorial that is more complicated than my research engine. It looked like it was on track to take about 15 days to complete on my existing desktop, which has scared me into looking at other options.
I've been poking around on here, looking through the benchmark thread for faster cheap hardware, but it's hard to place my machine in the mix, as I have not run the benchmark, nor will I have time to for a few days. I've considered a lot of options to speed things up. My budget is about $0, also.
1. Buying another unit like the one I have. At least then I could run 2 at a time.
2. Buying some $100 i5-4590 desktops from Craigslist and building a Beowulf cluster like this: "A low-cost Beowulf Cluster (grad student style)".
3. Buying an older multi-processor server. I've found a few that were interesting: a Dell R910 with 4x E7-4850 and 32x16 GB memory for $500, an R720 with 2x E5-2609v2 and 16 GB for $300, and an R420 with 2x E5-2420v2 and 16 GB for $200.
4. Dept chair suggests trying to get time on a big computer somewhere. IDK if that is possible with Converge or not. I'm trying to find that out. It is certainly the cheapest option, but if I can buy a $200 server and get a run done in a few days with it, I would rather go that route. |
|
February 12, 2020, 11:44 |
|
#2 | ||||
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
Since you seem to be limited to 16 cores max, a single workstation is really all you need.
When you shop for used machines with Xeon E5-2xxx v2, keep a few things in mind:
1) CPUs with numbers E5-24xx v2 only have three memory channels, compared to 4 memory channels on the E5-26xx v2 models. The extra memory channel is worth the investment.
2) CPUs with numbers E5-260x v2 don't have turbo boost and run at a low frequency overall. In other terms, they are really slow. Also, the E5-2609 v2 only has 4 cores. Overall, maybe don't go lower than two E5-2650 v2.
3) Since these CPUs have 4 memory channels, you need to populate at least 8 identical DIMMs in a dual-socket system to get decent performance. No need to buy it fully decked out from a retailer though; you can buy cheap used memory (DDR3 reg ECC) and install it yourself.
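To put rough numbers on the memory-channel point, here is a minimal sketch of theoretical peak bandwidth per socket. It assumes DDR3-1600 on both platforms (an assumption made only to isolate the effect of channel count; supported memory speeds differ by CPU model) and an 8-byte data bus per channel. The function name is made up for illustration.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth of one socket in GB/s.

    channels * transfers per second * bytes per transfer, with MT/s
    converted to GB/s by dividing by 1000.
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000.0


# Assumption: DDR3-1600 on both platforms, so only the channel count differs.
three_channel = peak_bandwidth_gbs(3, 1600)  # E5-24xx v2 style socket
four_channel = peak_bandwidth_gbs(4, 1600)   # E5-26xx v2 style socket

print(f"3 channels: {three_channel:.1f} GB/s per socket")
print(f"4 channels: {four_channel:.1f} GB/s per socket")
print(f"ratio: {four_channel / three_channel:.2f}")
```

Sustained bandwidth in a real solver run will be lower than these peaks, so treat the ratio as a rough guide rather than a prediction.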
|
|||||
February 12, 2020, 16:51 |
|
#3 |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
Thanks for your insight Alex.
When I was digging through the benchmark thread, it appeared to me that on these dual-socket machines, by the time you get 4 cores on each socket going, they don't get much faster after that, and certainly by 6 it has leveled off. Also, I recall seeing that the v2 stuff was not terribly faster than the v1 stuff. That sort of excited me about the 4-socket machine: lots of memory channels, and only run 4, 5, or 6 cores on each socket. I suppose that is what also drew me to the 2609v2 and the 2420v2: don't pay for the extra cores that won't speed things up. I was sort of thinking about buying 2 of the 2420v2 machines and coupling them with Ethernet. I'd have 24 cores (even if I only used 16) and 12 memory channels for less than $500, vs. 16 cores and 8 memory channels for $400 if I bought a 2650v2 machine.
So in light of all that, how about a few alternate ideas:
1. Local to me, a pile of 6 HP servers. 2 are E5-2420, the others are x55, with a bunch of RAM. $250. The plan would be to hook a couple together and sell off the rest. Probably not a good idea: more work, slow, etc. https://www.facebook.com/marketplace...5010249749173/
2. How about a 2630v3 machine? 16 cores, $330, but I'll have to buy DDR4. Is the speed of DDR4 worth the extra cost? https://www.ebay.com/itm/Gigabyte-R1...D/193326850686
3. 2637v2? Needs memory and drives, but $300. Looks like really high frequency, but 4 cores and DDR3. Cheap to populate though. https://www.ebay.com/itm/Dell-PowerE...d/324021469917
4. 2650v2 https://www.ebay.com/itm/HP-ProLiant...ry!61330!US!-1
Lastly, say a 2650v2 vs. a 2420v2: the 26 is going to cost 50% more. Is it going to be 50% faster? |
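The cores-and-channels comparison in this post can be written out as a small calculation. This is only a sketch of the arithmetic: the prices and counts are the ones quoted in the post ("less than $500" is entered as a $500 upper bound), and the `Option` class is a hypothetical helper, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    price_usd: float
    cores: int
    memory_channels: int

    @property
    def usd_per_channel(self) -> float:
        return self.price_usd / self.memory_channels

    @property
    def usd_per_core(self) -> float:
        return self.price_usd / self.cores


options = [
    # Two dual-socket E5-2420 v2 machines: 2 x 2 x 6 cores, 2 x 2 x 3 channels.
    # "less than $500" in the post, so 500 is an upper bound.
    Option("2x dual E5-2420 v2", 500.0, 24, 12),
    # One dual-socket E5-2650 v2 machine: 2 x 8 cores, 2 x 4 channels.
    Option("1x dual E5-2650 v2", 400.0, 16, 8),
]

for opt in options:
    print(f"{opt.name}: ${opt.usd_per_channel:.2f}/channel, ${opt.usd_per_core:.2f}/core")
```

The ratio says nothing about the Ethernet link between two separate machines or about per-core speed, so it cannot answer the "50% faster" question on its own.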
|
February 13, 2020, 12:38 |
|
#4 | |||
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
The benchmark results are not really comparable: different memory configurations, different kernel versions, different OpenFOAM versions, different people running the benchmarks. All else being equal, similar v2 CPUs should be around 15-20% faster than v1, assuming both use the rated memory frequency.
That being said, all these options are 1U server blades, filled to the brim with infernally loud 40mm fans. I really recommend buying a workstation platform instead, even if it might be a bit more expensive. |
||||
February 13, 2020, 18:36 |
|
#5 | |||||
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
Say we had a 2643 and a 2648L, both on DDR3-1600: would they have similar performance?
Right now, on my current desktop (i5-8400, 1x8 GB 2666 MHz), it's going to take 2-3 weeks per simulation. I need this to be more like 3-4 days, 5-6 tops.
Last edited by kstuart; February 19, 2020 at 03:01. |
||||||
February 17, 2020, 09:45 |
|
#6 |
New Member
Leo Natan
Join Date: Dec 2019
Posts: 6
Rep Power: 7 |
||
February 19, 2020, 02:58 |
|
#7 |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
I have myself semi-convinced to buy an R820 with 4x E5-4640v1, 16x 4 GB DDR3-1600, and a 1 TB hard drive for about $500 shipped. I'm planning on using Converge, and I think I will be able to use all 32 cores with it.
Is there anything odd about running a 4-processor machine? Can I install Ubuntu on it and run it just like my desktop, just bigger and noisier? If it is just that easy, it really seems like the solution based on cost and some results from the benchmarking thread. My other thought is buying 2 DL160s, each with 2x E5-2637v2, 8x 2 GB DDR3-1866, and a 1 TB hard drive, and connecting them with a 1 Gbit network. It would cost just a bit more, but would probably be the faster option if I am stuck running Forte and 16 cores. I have no idea how to set this up though. From the little bit of information I've found, it looks like it might be quite the pain to get set up. |
|
February 20, 2020, 05:29 |
|
#8 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
I guess you could do that. Just make sure you have a graphics card that fits into the R820 and can be supplied with power. Unless you plan to run it as just a compute node.
|
|
February 20, 2020, 15:51 |
|
#9 | |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
I'm going to get 16x 4 GB PC3-12800. Is there an alternate memory configuration I should get to improve things? I have checked into it, and with my license for Converge I will be able to use all the cores on this machine, and many more as well. |
||
February 20, 2020, 15:56 |
|
#10 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
16 DIMMs is the best memory configuration for this. 4 GB DIMMs will most likely be single-rank; you could gain a few percent of performance by using dual-rank, but it is probably going to cost more.
|
|
February 20, 2020, 15:59 |
|
#11 | |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
Also, comparing to this result from "OpenFOAM benchmarks on various hardware": the setup I'm looking at should perform close to that one, since that one should be handicapped by its 10600 memory? Thanks! |
||
February 20, 2020, 16:09 |
|
#12 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
You would have to look at the specifications/exact model of the memory you buy. For example, 1Rx4 indicates single rank, 2Rx4 is dual-rank.
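As a small illustration of reading that marking, here is a sketch that pulls the rank count out of a module label. The label strings in the example are hypothetical; real labels vary by vendor, so the pattern is an assumption about their general shape.

```python
import re
from typing import Optional

# Matches markings such as "1Rx4" or "2Rx8": <ranks>R x <device width>.
_RANK_PATTERN = re.compile(r"(\d)Rx(\d+)", re.IGNORECASE)


def ranks_from_label(label: str) -> Optional[int]:
    """Return the number of ranks encoded in a DIMM label, or None if absent."""
    match = _RANK_PATTERN.search(label)
    return int(match.group(1)) if match else None


# Hypothetical label strings, for illustration only.
for label in ["4GB 1Rx4 PC3-12800R", "8GB 2Rx4 PC3-12800R", "PC3-12800R"]:
    print(label, "->", ranks_from_label(label))
```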
You are getting older CPUs with lower core count, but faster memory. So it is safe to assume that both systems will be pretty close in terms of parallel performance in CFD. |
|
February 21, 2020, 02:25 |
|
#13 |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
Thanks for the advice, Alex. Ended up ordering it finally. $520 shipped, from this techmikeny place. R820, 4x E5-4640, 32x 2 GB PC3-12800R, 1 TB SATA, DVD-RW, 1100 W PSU, Ubuntu Server installed.
I was doing some digging trying to figure out what memory I needed to buy to get dual rank, and found some interesting posts. https://web.archive.org/web/20131102...ance-guide.pdf I'm pretty sure I would have needed to get 8 or 16 GB sticks to get dual rank, which would have added over $100 to the cost, and I don't need that much. The Dell paper shows 2 single-rank DIMMs are almost as good as a single dual-rank one, at least in the 2-processor machines. Hope it's the same for a 4. Anyway, looking forward to testing it out. Hope it's a "holy cow it's fast" post, not a "why is it so slow" post lol. |
|
February 29, 2020, 03:14 |
|
#14 |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
Got it running tonight. They ended up sending it with 18 sticks of 4 GB PC3-12800, just sort of randomly placed. Turns out they are dual-rank sticks, so it worked out OK after I put 16 of them where they needed to be. It's right about where I figured it would be, but I was kinda hoping it would be a little better. I think it's pretty awesome for a $500 setup. I'm so tempted to order another one to hook up to it. My i5-8400 desktop ($350ish, with 2x8 GB 2666 MHz DDR4) is at 342 s on 6 cores, and I thought that was a pretty fast machine for the money. This is almost 5x faster!
Plan to run a test case tomorrow with Forte, and see how it really does there.
# cores   Wall time (s)
------------------------
1         1137.55
2         619.25
4         264.8
6         187.7
8         142.35
10        125.07
12        105.16
14        96.76
16        85.26
18        85.96
20        77.75
22        78.91
24        71.71
26        75.19
28        69.97
32        72.9 |
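The wall times above can be turned into speedup and parallel efficiency relative to the single-core run. This is a minimal sketch using only the numbers posted; the function and variable names are made up for illustration.

```python
# Wall times in seconds, keyed by core count, copied from the post.
wall_times = {
    1: 1137.55, 2: 619.25, 4: 264.8, 6: 187.7, 8: 142.35, 10: 125.07,
    12: 105.16, 14: 96.76, 16: 85.26, 18: 85.96, 20: 77.75, 22: 78.91,
    24: 71.71, 26: 75.19, 28: 69.97, 32: 72.9,
}


def scaling_table(times):
    """Map core count -> (speedup, efficiency) relative to the 1-core run."""
    t1 = times[1]
    return {n: (t1 / t, t1 / (t * n)) for n, t in sorted(times.items())}


table = scaling_table(wall_times)
best_cores = min(wall_times, key=wall_times.get)  # core count with the lowest wall time

print("cores  speedup  efficiency")
for n, (speedup, efficiency) in table.items():
    print(f"{n:>5}  {speedup:7.2f}  {efficiency:10.1%}")
print(f"fastest run: {best_cores} cores, {wall_times[best_cores]} s")

# The 6-core desktop result quoted in the post, for comparison.
desktop_time = 342.0
print(f"vs. desktop: {desktop_time / wall_times[best_cores]:.2f}x faster")
```

Efficiency dropping well below 1 at high core counts is consistent with the earlier remark in the thread that these machines level off once memory bandwidth, not core count, becomes the limit.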
|
March 30, 2020, 04:49 |
|
#15 |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
I've been running a few Forte cases on it. A 2-cylinder premixed SI engine with 500k cells, over 2 cycles, takes about 6 days.
On Forte I can only run 16 cores. I've noticed it doesn't run 16 cores at full load; it runs all 32 at about half load. Is that bad? Any tips I can try to make this thing faster? I'm running Windows Server on it for Forte. |
|
March 30, 2020, 05:09 |
|
#16 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49 |
That's rather weird.
Which operating system are you using? And how exactly do you check CPU load on individual cores? My first shot in the dark with this kind of issue is core binding or lack thereof, as is often the case with NUMA systems. I assume you have Hyperthreading turned off in the BIOS? |
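To illustrate what core binding means on a 4-socket machine, here is a sketch that spreads 16 processes evenly over 4 sockets of 8 cores each, so every process keeps a fixed core and every socket's memory channels are used. It assumes cores are numbered contiguously per socket, which is not guaranteed on real hardware, and the function is hypothetical. It only computes the map; whether Forte exposes a way to apply one is not something this thread establishes.

```python
def balanced_pinning(n_ranks: int, n_sockets: int, cores_per_socket: int) -> list:
    """Assign each rank a fixed core, cycling over sockets round-robin.

    Assumption: core IDs are contiguous per socket, i.e. socket s owns
    cores s*cores_per_socket .. (s+1)*cores_per_socket - 1.
    """
    if n_ranks > n_sockets * cores_per_socket:
        raise ValueError("more ranks than physical cores")
    used = [0] * n_sockets  # cores already handed out on each socket
    mapping = []
    for rank in range(n_ranks):
        socket = rank % n_sockets
        mapping.append(socket * cores_per_socket + used[socket])
        used[socket] += 1
    return mapping


# 16 processes on a 4-socket, 8-cores-per-socket machine like the R820.
pinning = balanced_pinning(16, 4, 8)
for rank, core in enumerate(pinning):
    print(f"rank {rank:>2} -> core {core:>2} (socket {core // 8})")
```

On Linux, an MPI launcher's binding options or Python's `os.sched_setaffinity` would be typical ways to apply such a map; on Windows the mechanism differs.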
|
March 31, 2020, 04:07 |
|
#17 |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
I'm running Windows Server 2019 Standard Evaluation. I'm looking at "Resource Monitor". I'm pretty sure I turned off hyper-threading, but I've kinda wondered if it really is off.
|
|
April 28, 2020, 00:01 |
weird findings...
|
#18 |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
So, since I am running the student version of Ansys and am limited to 16 of the 32 cores, I thought I would see what happens if I tried to run 2 jobs at the same time. So I did, and it ran them, but weirdly. It was an Ansys Forte tutorial, and normally it runs in just a tick over 6 hrs on my system. So I started it 2 times with different names, about 1 min apart. The first one I started (19) finished in about 12 hrs. The second one I started (2) finished in about 6 hrs, like normal. Looking at the outputs, it looked like it ran the first submitted job at about half normal speed. It didn't just queue it until the other job finished. I posted a screenshot of the 2 folders.
Doing more digging, it looks like the first submitted job was slow since it didn't have enough memory allocated to it or something. Any ideas what that's all about, and how to improve that? |
|
January 6, 2021, 02:59 |
|
#19 |
New Member
Kurt Stuart
Join Date: Feb 2020
Location: Southern illinois
Posts: 19
Rep Power: 6 |
Bringing this back: I've ditched Ansys Forte and am working with Converge CFD now, much gooder!
I am super happy with the hardware, and I likely would not have made any progress on my dissertation this last year without it. Anyway, I'm at about 6 days for a simulation, and that's pretty good I think. But the University has a 400-core cluster, and I was supposed to be able to run on that, but COVID has hampered progress. It's made me think about going faster. Any guesses on the performance increase if I were to buy another R820 and connect the two through Ethernet? I can't remember what network card I got in my current R820, but I'd assume it's just a standard slow one. Can I expect close to double the performance? |
|
|
|