
Epyc 7551 vs 6850K; Ansys Mechanical Bench

Old   August 18, 2018, 21:36
Default Epyc 7551 vs 6850K; Ansys Mechanical Bench
  #1
Member
 
Join Date: Dec 2016
Posts: 44
Rep Power: 9
Duke711 is on a distinguished road
Case descriptions

https://www.dropbox.com/s/my9cv21ga1...tions.pdf?dl=0

results:


Case V18cg-1
Power Supply Module
Static, linear, thermal
Solver "JCG, real value, symmetric"
5266730 nodes ; 2303613 elements
Memory total in MB-> 37000

Epyc 7551 @ 2,55

Multiply with A Memory Bandwidth : 101.03 GB/s
Multiply with A MFLOP Rate : 10848.33 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 334.27 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 334.49 secs


2x Epyc 7301 @ 2,7



TOTAL PCG SOLVER SOLUTION CP TIME = 367.422 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 367.438 secs


Core i7-6850K @ 4,0:

Multiply with A Memory Bandwidth : 57.44 GB/s
Multiply with A MFLOP Rate : 6167.20 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 567.59 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 568.02 secs


6850K @ GTX 780 6 GB

Multiply with A Memory Bandwidth : 221.38 GB/s
Multiply with A MFLOP Rate : 23770.00 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 232.23 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 232.20 secs


-------------------------------------------------------------

Case V18cg-2
Tractor Rear Axle
Static, linear, structural
Solver "PCG, real-value, symmetric, msave,off"
12329235 nodes ; 2366046 elements
Memory total in MB-> 39000


Epyc 7551 @ 2,55

Multiply with A Memory Bandwidth : 173.46 GB/s
Multiply with A MFLOP Rate : 18625.56 MFlops
Solve With Precond MFLOP Rate : 14404.20 MFlops
Precond Factoring MFLOP Rate : 32734.42 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 182.94 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 183.33 secs


2x Epyc 7301 @ 2,7


Multiply with A Memory Bandwidth : 166.65 GB/s
Multiply with A GFLOP Rate : 17.89 GFlops
Solve With Precond GFLOP Rate : 13.73 GFlops
Precond Factoring GFLOP Rate : 35.39 GFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 189.09 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 189.42 secs


Core i7-6850K @ 4,0:

Multiply with A Memory Bandwidth : 86.34 GB/s
Multiply with A MFLOP Rate : 9270.32 MFlops
Solve With Precond MFLOP Rate : 8265.42 MFlops
Precond Factoring MFLOP Rate : 45436.20 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 373.34 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 373.88 secs


---------------------------------------

Case V18sp-4
Turbine
Static, nonlinear, structural, 5 cumulative iteration
Solver "sparse, real-value, symmetric"
1063750 nodes ; 726480 elements


in core solution


Epyc 7551 @ 2,55

Memory total in MB-> 63824

CPU Time 1 iteration(sec) = 364.688
ELAPSED Time 1 iteration(sec) = 364.970
ELAPSED Time 2016.406

computational rate (mflops) for solve = 16502.2230 // 16558.9734
effective I/O rate (MB/sec) for solve = 62873.4690 // 63089.6879


2x Epyc 7301 @ 2,7


CPU Time 1 iteration(sec) = 294.906
ELAPSED Time 1 iteration(sec) = 295.166
ELAPSED Time 1600.359

Core i7-6850K @ 4,0:

Memory total in MB-> 53350

CPU Time 1 iteration(sec) = 227.500
ELAPSED Time 1 iteration(sec) = 227.582
ELAPSED Time 1334.969

computational rate (mflops) for solve = 9143.7185 // 9162.1949
effective I/O rate (MB/sec) for solve = 34837.5672 // 34907.9622



6850K @ GTX 780 6 GB

""

------------------------------------


out of core solution

Epyc 7551 @ 2,55

Memory total in MB-> 18248

CPU Time 1 iteration(sec) = 531.719
ELAPSED Time 1 iteration(sec) = 561.629
ELAPSED Time 2763.203

computational rate (mflops) for solve = 2929.6081 // 656.7220
effective I/O rate (MB/sec) for solve = 11161.8068 // 2502.1107


Core i7-6850K @ 4,0:

Memory total in MB-> 8569

CPU Time 1 iteration(sec) = 262.656
ELAPSED Time 1 iteration(sec) = 283.993
ELAPSED Time 1487.516

computational rate (mflops) for solve = 1697.0636 // 736.2777
effective I/O rate (MB/sec) for solve = 6465.8122 // 2805.2182
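
To put a number on the in-core vs. out-of-core difference for V18sp-4, here is a minimal sketch that divides the total elapsed times reported above. The dictionary layout and the function name `out_of_core_penalty` are just illustrative choices, not anything the solver provides.

```python
# Total elapsed times (seconds) for case V18sp-4, copied from the results above.
elapsed = {
    "Epyc 7551": {"in_core": 2016.406, "out_of_core": 2763.203},
    "6850K": {"in_core": 1334.969, "out_of_core": 1487.516},
}


def out_of_core_penalty(times):
    """Return out-of-core elapsed time divided by in-core elapsed time."""
    return times["out_of_core"] / times["in_core"]


penalties = {name: out_of_core_penalty(t) for name, t in elapsed.items()}
for name, p in penalties.items():
    print(f"{name}: out-of-core run takes {p:.2f}x the in-core time")
```

This only compares wall-clock totals; it says nothing about why one system loses more than the other.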


--------------------------------------------------------------------------------------------------

Case V18sp-5
BGA
Transient, nonlinear, structural, 3 cumulative iteration
Solver "sparse, real-value, symmetric"
2073970 nodes ; 1266427 elements


in core solution


Epyc 7551 @ 2,55

Memory total in MB-> 87124

CPU Time 1 iteration(sec) = 337.578
ELAPSED Time 1 iteration(sec) = 338.200
ELAPSED Time 1209.125

computational rate (mflops) for solve = 18414.8072 // 18384.2639
effective I/O rate (MB/sec) for solve = 70160.4145 // 70044.0444


Core i7-6850K @ 4,0:

Memory total in MB-> 73015

CPU Time 1 iteration(sec) = 276.281
ELAPSED Time 1 iteration(sec) = 277.488
ELAPSED Time 1011.047

computational rate (mflops) for solve = 8489.0170 // 8474.9668
effective I/O rate (MB/sec) for solve = 32343.1543 // 32289.6233




Epyc 7551 vs 6850k without GPU acceleration / 2:2
Epyc 7551 vs 6850k with GPU acceleration / 1:3


The GTX 780's memory is too small for the cases.
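
The 2:2 and 1:3 scores above can be reproduced from the elapsed times. The following is a small sketch, assuming that "winning" a case simply means the lower total elapsed time, and using the in-core numbers for the sparse cases. The variable and function names are made up for illustration.

```python
# Elapsed times in seconds from the results above (in-core values for the sparse cases).
epyc_7551 = {"V18cg-1": 334.49, "V18cg-2": 183.33, "V18sp-4": 2016.406, "V18sp-5": 1209.125}
i7_6850k = {"V18cg-1": 568.02, "V18cg-2": 373.88, "V18sp-4": 1334.969, "V18sp-5": 1011.047}

# Only V18cg-1 has a GPU-accelerated result (6850K with GTX 780).
i7_6850k_gpu = dict(i7_6850k)
i7_6850k_gpu["V18cg-1"] = 232.20


def score(a, b):
    """Count the cases won by system a and by system b (lower elapsed time wins)."""
    wins_a = sum(1 for case in a if a[case] < b[case])
    return wins_a, len(a) - wins_a


print("without GPU:", score(epyc_7551, i7_6850k))
print("with GPU:   ", score(epyc_7551, i7_6850k_gpu))
```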

Last edited by Duke711; August 22, 2018 at 16:15.

Old   August 19, 2018, 07:50
Default
  #2
Super Moderator
 
flotus1's Avatar
 
Alex
Join Date: Jun 2012
Location: Germany
Posts: 3,427
Rep Power: 49
flotus1 has a spectacular aura about
Thanks for your efforts. A little more information about the systems would be nice.
Operating system, memory configuration, SMT settings, drive used for out-of-core...

Old   August 19, 2018, 12:04
Default
  #3
Member
 
Join Date: Dec 2016
Posts: 44
Rep Power: 9
Duke711 is on a distinguished road
Quote:
Originally Posted by flotus1 View Post
Thanks for your efforts. A little more information about the systems would be nice.
Operating system, memory configuration, SMT settings, drive used for out-of-core...

Win10


Epyc @ 8x 32 GB 2400 MHz // SMT on // 5x Samsung 970 Evo RAID 0

6850K @ 8x 16 GB 2400 MHz // SMT on // 5x Samsung 970 Evo RAID 0




https://www.dropbox.com/s/0jt7sm7irl...annt2.jpg?dl=0

Old   August 19, 2018, 12:08
Default
  #4
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
The performance of modern CPUs is stunning and reminds me how dated my workstation is...

My system is:
Win10
Supermicro X8DTi-F
2x Xeon X5670
96GB DDR3 ECC RAM / 1333Mhz (2x 3x16GB modules)
SMT-off
500GB Samsung 860EVO SSD drive

And here are my results for the first benchmark:

Case V18cg-1
Power Supply Module
Static, linear, thermal
Solver "JCG, real value, symmetric"
5266730 nodes ; 2303613 elements

Equation solver computational rate: 3.4 Gflops
TOTAL PCG SOLVER SOLUTION CP TIME = 1159.828 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 1178.000 secs


Case V18cg-2
Tractor Rear Axle
Static, linear, structural
Solver "PCG, real-value, symmetric, msave,off"
12329235 nodes ; 2366046 elements

Multiply with A Memory Bandwidth : 46.65 GB/s
Multiply with A GFLOP Rate : 5.01 GFlops
Solve With Precond GFLOP Rate : 4.37 GFlops
Precond Factoring GFLOP Rate : 20.42 GFlops
CP Time(sec) = 822.375
Elapsed Time (sec) = 856.000

Old   August 22, 2018, 16:13
Default
  #5
Member
 
Join Date: Dec 2016
Posts: 44
Rep Power: 9
Duke711 is on a distinguished road
Update



2x Epyc 7301 @ 2,7





Case V18cg-1
Power Supply Module
Static, linear, thermal
Solver "JCG, real value, symmetric"
5266730 nodes ; 2303613 elements
Memory total in MB-> 37000


2x Epyc 7301 @ 2,7

TOTAL PCG SOLVER SOLUTION CP TIME = 367.422 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 367.438 secs


------------------------------------------------------------



Case V18cg-2
Tractor Rear Axle
Static, linear, structural
Solver "PCG, real-value, symmetric, msave,off"
12329235 nodes ; 2366046 elements
Memory total in MB-> 39000


2x Epyc 7301 @ 2,7

Multiply with A Memory Bandwidth : 166.65 GB/s
Multiply with A GFLOP Rate : 17.89 GFlops
Solve With Precond GFLOP Rate : 13.73 GFlops
Precond Factoring GFLOP Rate : 35.39 GFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 189.09 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 189.42 secs


--------------------------------------------------------------------




Case V18sp-4
Turbine
Static, nonlinear, structural, 5 cumulative iteration
Solver "sparse, real-value, symmetric"
1063750 nodes ; 726480 elements


in core solution


2x Epyc 7301 @ 2,7

Memory total in MB-> 63824

CPU Time 1 iteration(sec) = 294.906
ELAPSED Time 1 iteration(sec) = 295.166
ELAPSED Time 1600.359


--------------------------





SMT off:




Case V18cg-1


TOTAL PCG SOLVER SOLUTION CP TIME = 374.16 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 375.54 secs
Multiply with A Memory Bandwidth : 84.69 GB/s
Multiply with A GFLOP Rate : 9.09 GFlops


---------------------------------------





Case V18cg-2




TOTAL PCG SOLVER SOLUTION CP TIME = 230.58 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 232.76 secs
Multiply with A Memory Bandwidth : 129.47 GB/s
Multiply with A GFLOP Rate : 13.90 GFlops
Solve With Precond GFLOP Rate : 12.64 GFlops
Precond Factoring GFLOP Rate : 41.05 GFlops







------------------------------


Case V18sp-4



DSP Matrix Solver CPU Time (sec) = 280.375
DSP Matrix Solver ELAPSED Time (sec) = 281.834
1599.297

Old   August 23, 2018, 01:58
Default
  #6
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
How is it possible that the i7-6850K is that much faster than a dual Xeon X5670? Is it all down to the DDR4 memory?
Realistically, should I expect a 2x E5-2690V2 system (20 cores in total, but DDR3 RAM) to be even faster than the 6850K (6 cores)?

6850k
Multiply with A Memory Bandwidth : 86.34 GB/s
Multiply with A MFLOP Rate : 9270.32 MFlops
Solve With Precond MFLOP Rate : 8265.42 MFlops
Precond Factoring MFLOP Rate : 45436.20 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 373.34 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 373.88 secs

2x X5670
Multiply with A Memory Bandwidth : 46.65 GB/s
Multiply with A GFLOP Rate : 5.01 GFlops
Solve With Precond GFLOP Rate : 4.37 GFlops
Precond Factoring GFLOP Rate : 20.42 GFlops
CP Time(sec) = 822.375
Elapsed Time (sec) = 856.000

Old   August 23, 2018, 02:44
Default
  #7
Senior Member
 
Simbelmynë's Avatar
 
Join Date: May 2012
Posts: 552
Rep Power: 16
Simbelmynë is on a distinguished road
Quote:
Originally Posted by Echidna View Post
How is it possible that the i7-6850K is that much faster than a dual Xeon X5670? Is it all down to the DDR4 memory?
Realistically, should I expect a 2x E5-2690V2 system (20 cores in total, but DDR3 RAM) to be even faster than the 6850K (6 cores)?

6850k
Multiply with A Memory Bandwidth : 86.34 GB/s
Multiply with A MFLOP Rate : 9270.32 MFlops
Solve With Precond MFLOP Rate : 8265.42 MFlops
Precond Factoring MFLOP Rate : 45436.20 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 373.34 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 373.88 secs

2x X5670
Multiply with A Memory Bandwidth : 46.65 GB/s
Multiply with A GFLOP Rate : 5.01 GFlops
Solve With Precond GFLOP Rate : 4.37 GFlops
Precond Factoring GFLOP Rate : 20.42 GFlops
CP Time(sec) = 822.375
Elapsed Time (sec) = 856.000

Assuming mechanical calculations behave similarly to CFD calculations, then:



The X5670 system has a total of 6 memory channels with 1333 MHz memory.


The 6850k system has a total of 4 memory channels with 2400 MHz memory.


So the 6850k has more available bandwidth.



I guess none of the systems is completely bandwidth-limited, though. But the Broadwell-based system should have much better IPC than the Westmere system, and it is also clocked higher.


A used dual 2690v2 seems to be a sweet-spot in terms of price performance right now, at least for CFD.
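
As a rough check of the bandwidth argument, the theoretical peak can be estimated as channels x transfer rate x 8 bytes. This is only a sketch: it assumes the "1333 MHz" and "2400 MHz" figures are the DDR transfer rates in MT/s and that each channel is 64 bits wide. The function name is made up for illustration.

```python
def peak_bandwidth_gb_s(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak = channels * transfer rate * bytes per transfer (64-bit channel assumed)."""
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


# Dual X5670: 6 channels in total at DDR3-1333 (figures from the post above).
x5670_dual = peak_bandwidth_gb_s(channels=6, mega_transfers_per_s=1333)
# 6850K: 4 channels at DDR4-2400 (figures from the post above).
i7_6850k = peak_bandwidth_gb_s(channels=4, mega_transfers_per_s=2400)

print(f"2x X5670: {x5670_dual:.1f} GB/s")
print(f"6850K:    {i7_6850k:.1f} GB/s")
```

That gives roughly 64 GB/s against 77 GB/s. Note that the solver's "Multiply with A Memory Bandwidth" lines seem to be a different kind of number (86.34 GB/s for the 6850K is above this peak), so I would not compare the two directly.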

Old   August 23, 2018, 11:35
Default
  #8
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
Effects of SMT

CG-2 mechanical benchmark
2x X5670
96GB ECC DDR3 1333Mhz / 2 x 3 x 16GB DIMMs
SMT-OFF
Multiply with A Memory Bandwidth : 54.65 GB/s
Multiply with A GFLOP Rate : 5.87 GFlops
Solve With Precond GFLOP Rate : 4.92 GFlops
Precond Factoring GFLOP Rate : 23.03 GFlops
CP Time(sec) = 744.312
Elapsed Time (sec) = 753.000


CG-2 mechanical benchmark
2x X5670
96GB ECC DDR3 1333Mhz / 2 x 3 x 16GB DIMMs
SMT-ON
Multiply with A Memory Bandwidth : 65.05 GB/s
Multiply with A GFLOP Rate : 6.98 GFlops
Solve With Precond GFLOP Rate : 5.83 GFlops
Precond Factoring GFLOP Rate : 24.09 GFlops
CP Time(sec) = 674.297
Elapsed Time (sec) = 675.000
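
The SMT effect on the two CG-2 runs above, worked out from the elapsed times. This is just arithmetic on the posted numbers and the variable names are arbitrary.

```python
# Elapsed times (seconds) for the CG-2 benchmark on the 2x X5670 system.
smt_off = 753.000
smt_on = 675.000

speedup = smt_off / smt_on                          # how many times faster with SMT on
reduction_pct = (smt_off - smt_on) / smt_off * 100  # share of run time saved, in percent

print(f"SMT on is {speedup:.3f}x faster ({reduction_pct:.1f}% less elapsed time)")
```

So on this machine and this case SMT saves roughly a tenth of the run time. A single pair of runs is not much of a sample, though.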

Old   April 3, 2019, 11:11
Default
  #9
Member
 
Join Date: Dec 2016
Posts: 44
Rep Power: 9
Duke711 is on a distinguished road
Update


Case V18cg-1
Power Supply Module
Static, linear, thermal
Solver "JCG, real value, symmetric"
5266730 nodes ; 2303613 elements
Memory total in MB-> 37000



2x E5 2670 0 @ 3 // pc 12800

Multiply with A Memory Bandwidth : 57.99 GB/s
Multiply with A GFLOP Rate : 6.23 GFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 567.13 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 568.62 secs





Case V18cg-2
Tractor Rear Axle
Static, linear, structural
Solver "PCG, real-value, symmetric, msave,off"
12329235 nodes ; 2366046 elements
Memory total in MB-> 39000


2x E5 2670 0 @ 3 // pc 12800

Multiply with A Memory Bandwidth : 104.42 GB/s
Multiply with A GFLOP Rate : 11.21 GFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 320.83 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 324.05 secs




Case V18sp-4
Turbine
Static, nonlinear, structural, 5 cumulative iteration
Solver "sparse, real-value, symmetric"
1063750 nodes ; 726480 elements

2x E5 2670 0 @ 3 // pc 12800


CPU Time 1 iteration(sec) =
ELAPSED Time 1 iteration(sec) = 670.812
ELAPSED Time 3354.06

Last edited by Duke711; April 3, 2019 at 16:51.

Old   July 21, 2019, 08:13
Default
  #10
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
Case V18sp-4
Turbine
Static, nonlinear, structural, 5 cumulative iteration
Solver "sparse, real-value, symmetric"
1063750 nodes ; 726480 elements


2x X5670
96GB ECC DDR3 1333Mhz / 2 x 3 x 16GB
SMT-OFF



EQUIL ITER 1

CPU TIME = 507.9

ELAPSED TIME = 526.7


The solution converged in the 3rd iteration, so I can't compare the total elapsed time for 5 iterations.



How is my system so fast in this case?





In comparison:



2x E5 2670 0 @ 3 // pc 12800
ELAPSED Time 1 iteration(sec) = 670.812


2x Epyc 7301 @ 2,7
CPU Time 1 iteration(sec) = 294.906
ELAPSED Time 1 iteration(sec) = 295.166

Old   July 22, 2019, 05:54
Default
  #11
New Member
 
Erik
Join Date: Jul 2019
Posts: 7
Rep Power: 7
erik87 is on a distinguished road
It could be missing RAM or incorrectly populated RAM slots, which would cost the E5 system its full eight-channel memory bandwidth.

Old   July 22, 2019, 10:41
Default
  #12
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
Quote:
Originally Posted by erik87 View Post
It could be missing RAM or incorrectly populated RAM slots, which would cost the E5 system its full eight-channel memory bandwidth.
That’s possible, but the dual Epyc’s performance is still not that much better than my system’s, which is weird.

Old   July 29, 2019, 14:24
Default
  #13
Member
 
Join Date: Dec 2016
Posts: 44
Rep Power: 9
Duke711 is on a distinguished road
Case V18sp-4
Turbine
Static, nonlinear, structural, 5 cumulative iteration
Solver "sparse, real-value, symmetric"
1063750 nodes ; 726480 elements


in core solution




2x E5 2670 0 @ 2,99 Ghz ; 1600 MHz


--smt off


100%

83/512 GB





CPU Time 1 iteration(sec) = 255.516
ELAPSED Time 1 iteration(sec) = 259.342
ELAPSED Time 1434.203




--smt on


59%

76/512 GB





CPU Time 1 iteration(sec) = 263.545
ELAPSED Time 1 iteration(sec) = 267.312
ELAPSED Time 1384.131






6850K @ 4,2 Ghz ; 3000 MHz


---smt on


61%
67/128 GB



CPU Time 1 iteration(sec) = 262.542
ELAPSED Time 1 iteration(sec) = 263.535
ELAPSED Time 1392.321





second run:


CPU Time 1 iteration(sec) = 360.516
ELAPSED Time 1 iteration(sec) = 366.171
ELAPSED Time 1989.312


smp; shared memory parallel

CPU Time 1 iteration(sec) = 1245.609
ELAPSED Time 1 iteration(sec) = 1257.387



6850K @ 4,0 Ghz ; 2400 MHz

CPU Time 1 iteration(sec) = 262.656
ELAPSED Time 1 iteration(sec) = 283.993
ELAPSED Time 1487.516



-> A nonlinear case is not a good benchmark: the elapsed time per iteration varies a lot.



Sparse solver, in core:

-> Number of cores has no effect
-> CPU frequency has no effect
-> Memory frequency has no effect


Memory bandwidth is what matters, so the number of cores per memory channel is very important:


Dual Epyc 7301 --> 2 cores per channel


6850K --> 1,5 cores per channel


Dual E5 2670 --> 2 cores per channel


Single Epyc 7551 --> 4 cores per channel, and very slow
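
The cores-per-channel figures above follow from dividing total cores by total memory channels. Below is a short sketch. The core and channel counts are my assumption from the public spec sheets (16 cores and 8 channels per Epyc 7301, 8 cores and 4 channels per E5-2670, 32 cores and 8 channels for the Epyc 7551, 6 cores and 4 channels for the 6850K) and assume every channel is populated.

```python
# (total cores, total memory channels) per system.
# Counts are assumed from public specifications, with all channels populated.
systems = {
    "Dual Epyc 7301": (32, 16),
    "6850K": (6, 4),
    "Dual E5 2670": (16, 8),
    "Single Epyc 7551": (32, 8),
}

cores_per_channel = {name: cores / channels for name, (cores, channels) in systems.items()}

# Print from fewest to most cores sharing one channel.
for name, ratio in sorted(cores_per_channel.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:g} cores per channel")
```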


7551 best results:


CPU Time 1 iteration(sec) = 531.719
ELAPSED Time 1 iteration(sec) = 561.629
ELAPSED Time 2763.203

Old   July 29, 2019, 14:43
Default
  #14
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
I will make the same benchmark with a dual E5-2650V2 setup soon to see how it performs.


The Epyc 7551 is very slow, and I think this is due to its low base frequency.

I think it would perform way better in a larger case. You could run the sp5 benchmark which is more demanding.



It would be interesting to see how an Intel Scalable setup performs on these FEA benchmarks.

Old   July 29, 2019, 15:04
Default
  #15
Member
 
Join Date: Dec 2016
Posts: 44
Rep Power: 9
Duke711 is on a distinguished road
I think the best setup for each solver is:


JCG (thermal) solver:

- an eight-core CPU with a Tesla and a lot of VRAM; in FEA the graphics card only runs in a medium or low power mode


PCG (iterative) solver:

- a big CPU with a lot of cores


Sparse (direct) solver:

- a six-core CPU with low power consumption


For FEA, a multi-CPU platform with very high power consumption is not a good idea. It only makes sense for CFD.

Old   July 29, 2019, 15:21
Default
  #16
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
Duke,

I do most of my FEA runs in iterative mode, simply because I would need tons of RAM to run them with the sparse solver.

So I think that for large FEA cases, where you most often use iterative mode, multi-core CPUs or a multi-CPU platform is the way to go.

For smaller FEA cases what you need is a fast 6 or 8 core CPU.

Old   July 29, 2019, 16:47
Default
  #17
Member
 
Join Date: Dec 2016
Posts: 44
Rep Power: 9
Duke711 is on a distinguished road
Convergence of the iterative (PCG) solver is not guaranteed, and for nonlinear problems or highly distorted elements the sparse solver is much more stable and faster than the PCG solver. The PCG solver is good and fast for linear or simple problems.

Old   November 28, 2019, 08:11
Default
  #18
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
Case V18sp-4
Turbine
Static, nonlinear, structural, 5 cumulative iteration
Solver "sparse, real-value, symmetric"
1063750 nodes ; 726480 elements

in core solution

---------------------------------------------------------------------------------------------------------------------------

2x E5-2650V2 @ 3Ghz (16cores total) (For some reason my CPUs stopped at 3GHz max and didn't go to 3.2GHz)
128GB DDR3 RAM (not perfectly populated due to MB issues)

Total memory allocated for solver = 61088.564 MB

ITER 1 / DSP Matrix Solver / CPU Time (sec) = 219.969
ITER 1 / DSP Matrix Solver / ELAPSED Time (sec) = 220.744
EQUIL ITER 1 CPU TIME = 231.7
ELAPSED TIME = 232.6

TOTAL ELAPSED TIME FOR 5 iterations = 1597.000

Equation solver computational rate: 151.0 Gflops
Equation solver effective I/O rate: 2.4 GB/sec

-----------------------------------------------------------------------------------------------------------

If we compare this to Duke711's benchmark using 2x Epyc7301 we will be a bit shocked.

2x Epyc 7301 @ 2,7 (32cores total)

CPU Time 1 iteration(sec) = 294.906
ELAPSED Time 1 iteration(sec) = 295.166
ELAPSED Time 1600.359

Old   November 28, 2019, 08:38
Default
  #19
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
Case V18cg-1
Power Supply Module
Static, linear, thermal
Solver "JCG, real value, symmetric"
5266730 nodes ; 2303613 elements

----------------------------------------------------------------------------

2x E5-2650V2 @ 3Ghz (16cores total) (For some reason my CPUs stopped at 3GHz max and didn't go to 3.2GHz)
128GB DDR3 RAM (not perfectly populated due to MB issues)

Multiply with A Memory Bandwidth : 82.14 GB/s
Multiply with A GFLOP Rate : 8.82 GFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 463.53 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 463.91 secs

-----------------------------------------------------------------------------------

(DUKE711)

Epyc 7551 @ 2,55

Multiply with A Memory Bandwidth : 101.03 GB/s
Multiply with A MFLOP Rate : 10848.33 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 334.27 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 334.49 secs


2x Epyc 7301 @ 2,7

TOTAL PCG SOLVER SOLUTION CP TIME = 367.422 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 367.438 secs


Core i7-6850K @ 4,0:

Multiply with A Memory Bandwidth : 57.44 GB/s
Multiply with A MFLOP Rate : 6167.20 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 567.59 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 568.02 secs

Old   November 28, 2019, 09:08
Default
  #20
Member
 
Join Date: Jun 2010
Posts: 77
Rep Power: 16
Echidna is on a distinguished road
Case V18cg-2
Tractor Rear Axle
Static, linear, structural
Solver "PCG, real-value, symmetric, msave,off"
12329235 nodes ; 2366046 elements

----------------------------------------------------------------------------

2x E5-2650V2 @ 3Ghz (16cores total) (For some reason my CPUs stopped at 3GHz max and didn't go to 3.2GHz)
128GB DDR3 RAM (not perfectly populated due to MB issues)


Multiply with A Memory Bandwidth : 86.31 GB/s
Multiply with A GFLOP Rate : 9.27 GFlops
Solve With Precond GFLOP Rate : 7.87 GFlops
Precond Factoring GFLOP Rate : 30.77 GFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 373.69 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 375.29 secs

----------------------------------------------------------------------------------

DUKE711

2x Epyc 7301 @ 2,7
Multiply with A Memory Bandwidth : 166.65 GB/s
Multiply with A GFLOP Rate : 17.89 GFlops
Solve With Precond GFLOP Rate : 13.73 GFlops
Precond Factoring GFLOP Rate : 35.39 GFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 189.09 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 189.42 secs


Core i7-6850K @ 4,0:
Multiply with A Memory Bandwidth : 86.34 GB/s
Multiply with A MFLOP Rate : 9270.32 MFlops
Solve With Precond MFLOP Rate : 8265.42 MFlops
Precond Factoring MFLOP Rate : 45436.20 MFlops
TOTAL PCG SOLVER SOLUTION CP TIME = 373.34 secs
TOTAL PCG SOLVER SOLUTION ELAPSED TIME = 373.88 secs
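
Since the results in this thread mix MFlops and GFlops, comparing them by eye is error-prone. Here is a minimal sketch of a parser that converts the rate lines to GFlops. The regular expression and the function name `parse_rate_gflops` are my own and only cover the line format as it was pasted in this thread, not any official Ansys output specification. It trusts the unit at the end of the line rather than the MFLOP/GFLOP wording in the label.

```python
import re

# Matches lines such as "Multiply with A MFLOP Rate : 9270.32 MFlops".
RATE_RE = re.compile(
    r"^(?P<label>.+?)\s+[MG]FLOP Rate\s*:\s*(?P<value>[\d.]+)\s*(?P<unit>[MG])Flops$"
)


def parse_rate_gflops(line):
    """Return (label, rate in GFlops), or None if the line is not a rate line."""
    match = RATE_RE.match(line.strip())
    if match is None:
        return None
    value = float(match.group("value"))
    if match.group("unit") == "M":
        value /= 1000.0  # MFlops -> GFlops
    return match.group("label"), value


for line in (
    "Multiply with A MFLOP Rate : 9270.32 MFlops",  # 6850K, from this post
    "Multiply with A GFLOP Rate : 17.89 GFlops",    # 2x Epyc 7301, from this post
):
    print(parse_rate_gflops(line))
```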