Upgrade from 2x E5-2687W v3 for Comsol 5.3 electromagnetic simulations
August 6, 2019, 04:44 |
#1 |
New Member
Joshua
Join Date: Aug 2019
Posts: 3
Rep Power: 7 |
Hello everyone,
We are looking for an upgrade for our workstation, as it is heavily used for all sorts of different tasks. We are only doing electromagnetic simulations with COMSOL 5.3, with a relatively low node count of 10,000 up to 250,000. A typical simulation uses less than 10 GB of RAM. The new workstation will be used exclusively for these calculations.

Our current machine:
- Dual E5-2687W v3 (each 10 cores, 3.1 GHz, Haswell)
- 192 GB (8x 16GB) of dual-rank DDR4 RAM running in quad channel at 2132 MHz

We are looking for a performance increase of about 50%; otherwise, a new machine is not worth the investment. Initially we planned on spending about 4000 at most. Unfortunately, benchmarks for COMSOL are not very popular, and other benchmarks are all over the place.

Our first plan was to buy a Threadripper 2990WX and the fastest available RAM for it, but after reading in this forum I'm not so sure anymore because of the two dies without a direct memory controller.

We would greatly appreciate your opinion on the performance and your suggestions.
|
August 6, 2019, 12:30 |
|
#2 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,428
Rep Power: 49 |
First things first: You won't get a 50% upgrade over your current machine with 4000$. Maybe not even with 20000$. And avoid TR 2990WX at all costs.
Reading through some of the advice for COMSOL and combining it with your requirements of very low element counts, I come to the conclusion that there may not be anything on the market worth buying compared to your current machine. The problems in particular:

Scaling with low element counts is bad
That's just normal and there is not really a way around this. Parallelization overhead increases as element count decreases. So just buying something with more cores and more memory bandwidth won't help much for strong scaling of small cases.

The solver seems to have a significant serial fraction
The extent seems to change depending on the exact solver type you use. But in general this means that high core counts don't help much, and even higher memory bandwidth (which the solver definitely likes) does not help much either beyond a certain point. What is needed here is high single-core performance, and the Xeon E5-2687W v3 is quite good in that regard. Amdahl's law at work.

You can verify how your small cases scale on your machine by running it on 1, 2, 4, ... cores and comparing the execution times. This could definitely help choosing an upgrade path.

There would be a simple way around all of this in case your workflow and licenses allow it: Instead of running 1 case on all cores of the machine, run several cases at the same time with lower core count each. The cases will of course run slower due to the memory bottleneck, but this resolves scaling issues and leads to higher overall throughput. So e.g. running 4 cases at the same time will only take 2-3 times as long as running a single case.

I assume you already tweaked the solver settings and disabled SMT. If available, activating cluster-on-die mode for your CPUs should also yield some performance gains combined with the right execution flags.
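To put rough numbers on the two arguments above, here is a minimal Python sketch of Amdahl's law and of the throughput gain from running several cases side by side. The parallel fraction of 0.80 is purely an illustrative assumption, not a measured COMSOL value, and both function names are made up for this example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only `parallel_fraction`
    of the single-core run time can be parallelized (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def concurrent_throughput_gain(cases: int, slowdown: float) -> float:
    """Throughput gain from running `cases` jobs at once, when the whole
    batch takes `slowdown` times as long as one job run on its own."""
    return cases / slowdown


if __name__ == "__main__":
    # Assumption: 80% of the work parallelizes. Replace with a value
    # fitted to your own scaling measurements.
    for cores in (1, 2, 4, 8, 10, 20):
        print(f"{cores:2d} cores: {amdahl_speedup(0.80, cores):.2f}x")

    # Estimate from the post: 4 simultaneous cases take 2-3 times as long
    # as a single case.
    for slowdown in (2.0, 3.0):
        gain = concurrent_throughput_gain(4, slowdown)
        print(f"4 cases at {slowdown:.0f}x the run time: {gain:.2f}x throughput")
```

With a serial fraction of 20%, the speedup can never exceed 5x no matter how many cores are added, which is why extra cores buy so little for a single small case.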
There is a discussion about the fastest settings here, specifically with a NUMA machine: https://www.comsol.com/forum/thread/...opteron-system

Edit: thinking about this again, maybe there is a chance to get a relatively cheap upgrade. Assuming the following criteria are met:
1) you cannot run more than 1 case at the same time due to licensing constraints
2) the case scales better on a single CPU than distributed across both CPUs
3) overall bad scaling beyond around 8 cores
In this case, a Core i7-9800X along with 4x16GB of the fastest memory you can afford might perform better.

Last edited by flotus1; August 7, 2019 at 04:43.
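The scaling check suggested earlier in this post (same case on 1, 2, 4, ... cores, then compare run times) could be scripted roughly as follows. This is only a sketch: it assumes the COMSOL batch launcher is on the PATH as `comsol`, and `model.mph` and `out_N.mph` are placeholder file names. A dummy command is used by default so the script runs without COMSOL installed.

```python
import subprocess
import sys
import time
from typing import Callable, Dict, Iterable, List


def time_command(cmd: List[str]) -> float:
    """Run one command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def scaling_table(core_counts: Iterable[int],
                  make_cmd: Callable[[int], List[str]]) -> Dict[int, float]:
    """Time the same case once for each core count."""
    return {n: time_command(make_cmd(n)) for n in core_counts}


def comsol_cmd(cores: int) -> List[str]:
    # Assumption: `comsol` is on PATH; the file names are placeholders.
    return ["comsol", "batch", "-np", str(cores),
            "-inputfile", "model.mph", "-outputfile", f"out_{cores}.mph"]


def dummy_cmd(cores: int) -> List[str]:
    # Stand-in command so the sketch can be tried without COMSOL.
    return [sys.executable, "-c", "pass"]


if __name__ == "__main__":
    times = scaling_table([1, 2, 4, 8], dummy_cmd)  # swap in comsol_cmd
    base = times[1]
    for n, t in times.items():
        print(f"{n:2d} cores: {t:8.2f} s, speedup {base / t:.2f}x")
```

For meaningful numbers the machine should be otherwise idle, and each core count is best run more than once.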
|
August 7, 2019, 13:14 |
|
#3 |
Senior Member
Join Date: May 2012
Posts: 552
Rep Power: 16 |
Just to add to this. I tested one of our dual EPYC 7301 machines with Comsol. Just out of curiosity since I do not normally use Comsol.
An 8700K was faster for a CFD case with approximately one million degrees of freedom. I think Comsol suffers a lot from Amdahl's law. We have seen better scaling on Intel CPUs though, so that might also be a factor here.
|
August 7, 2019, 13:47 |
|
#4 |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,428
Rep Power: 49 |
The way I understood it, COMSOL has two kinds of parallelism implemented. One for shared memory and one for distributed memory systems.
For a workstation with a rather complicated NUMA topology like 2x Epyc, choosing the right one and getting the settings for core binding correct is crucial for performance. I would imagine that just starting it with -np will lead to abysmal performance. They have two pretty in-depth articles about setting up each parallel mode:
distributed: https://www.comsol.com/support/knowledgebase/1001/
shared: https://www.comsol.com/support/knowledgebase/1096/
|
August 8, 2019, 04:22 |
|
#5 |
New Member
Joshua
Join Date: Aug 2019
Posts: 3
Rep Power: 7 |
Thanks for your testing and considerations.
To fill in the missing information, some more details from my side:
- Licensing should not be a problem, as we can compute as many cases as we want, as long as COMSOL is running on one PC (a cluster should be possible as well, not tried yet). From what I read on the forums, however, this is only somewhat true for Windows, as with Linux one instance of COMSOL uses one license.
- Turning off Hyperthreading did not change the computing speed.
- We did not try different execution flags up to now, but we'll definitely try.
- We have seen the performance scale roughly with the square root of the number of threads, but this was only tested for "high" core counts (20, 15, 10, ...).
- We're using the MUMPS solver, which, as the COMSOL link suggests, is not ideal for high core counts.
- The current PC is used by other users and programs, so if we can get at least the same performance for, let's say, about 2000€, we would buy a new machine too.

I will try to get some scaling and flag benchmarks in the meantime.

Edit: We tried the PARDISO solver and it appears to be faster using all cores (typical sweep: 8 min vs 12 min). However, testing the expected different scaling of this solver has to wait some time, as the workstation is being used by other people as well right now, which might spoil the results.

Last edited by fernbedienung; August 8, 2019 at 05:39.
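As a rough illustration of what square-root scaling implies, the sketch below computes the parallel efficiency and the serial fraction estimated with the Karp-Flatt metric. Two assumptions go beyond the post: that the square-root law holds relative to a single-thread run (only 10, 15 and 20 threads were actually tested), and that Karp-Flatt is a reasonable way to read the numbers. Nobody in the thread used this metric.

```python
import math


def karp_flatt(speedup: float, cores: int) -> float:
    """Experimentally determined serial fraction (Karp-Flatt metric)
    from a measured speedup on `cores` cores."""
    if cores < 2:
        raise ValueError("the metric needs at least 2 cores")
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    for cores in (10, 15, 20):
        # Assumption: speedup over one thread equals sqrt(threads).
        speedup = math.sqrt(cores)
        efficiency = speedup / cores
        serial = karp_flatt(speedup, cores)
        print(f"{cores} threads: speedup {speedup:.2f}x, "
              f"efficiency {efficiency:.0%}, "
              f"implied serial fraction {serial:.0%}")
```

Under these assumptions, 20 threads give a speedup of about 4.5x, i.e. each thread is used at a bit over a fifth of its potential.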
|
August 16, 2019, 08:02 |
|
#6 | |
New Member
Joshua
Join Date: Aug 2019
Posts: 3
Rep Power: 7 |
Running the cases of a parameter sweep as a batch sweep massively improved computation time!
- 1 process, 20 cores: 9:03 h
- 10 processes, 2 cores each: 2:40 h
- 20 processes, 1 core each: 2:31 h

COMSOL is already aware of the topology of two CPUs with 10 cores each (also in the settings, not only via flags), but as I understand it, the additional flags should only affect the execution when using more than 1 core for one calculation? If so, I guess this is the best optimization we can get. So thanks for all your great help!!!
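For reference, the gain from these three timings can be worked out with a few lines of Python. The wall times are the ones reported in this post; the helper name `to_minutes` is just for this example.

```python
def to_minutes(hmm: str) -> int:
    """Convert an 'H:MM' wall time such as '9:03' to minutes."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


# (processes, cores per process, wall time) as reported in the post.
runs = [
    (1, 20, "9:03"),
    (10, 2, "2:40"),
    (20, 1, "2:31"),
]

baseline = to_minutes(runs[0][2])
for processes, cores, wall in runs:
    speedup = baseline / to_minutes(wall)
    print(f"{processes:2d} x {cores:2d} cores: {wall} h, "
          f"{speedup:.2f}x faster, {100 * (speedup - 1):.0f}% more throughput")
```

The two batch configurations come out at roughly 3.4x and 3.6x the speed of the single 20-core run.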
August 16, 2019, 08:12 |
|
#7 | |
Super Moderator
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,428
Rep Power: 49 |
Anyway, great to hear that you got a 250% performance increase for free. With that out of the way, you could of course upgrade to a faster machine now. With scaling issues resolved, you can now benefit from the hardware improvements of the last 5 years. |