Fluent benchmarks on new Intel CPUs

February 15, 2008, 04:03
#1
cfdmystic (Guest)

Hi there,

I am currently evaluating hardware for running Fluent and came across this very impressive "World record" from Sun.

http://blogs.sun.com/bmseer/entry/fl...records_on_sun

Unfortunately this is a rather impressive piece of marketing baloney - perhaps I am a bit harsh - but it does deserve a closer look (comments on the blog are closed, or I would have taken it up with the author directly).

In my world we have to pay for each box and each CPU and each core. So when we run an application we want each of the cores to work. I speculate that this benchmark was achieved by running 1 core (out of 4 in the box) on 4 different machines connected via InfiniBand - leaving 12 cores useless/inactive. The 8-core result is achieved by using 2 (out of the 4) on each of the 4 boxes - leaving 8 inactive cores. This produces impressive scaling figures and high numbers - but is of no use for real-world (my world) applications.
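
The idle-core arithmetic above can be written out as a small sketch. It assumes the layout speculated in the post (4 boxes with 4 cores each, processes spread evenly); the function name `idle_cores` is made up for illustration and nothing here comes from the Sun blog itself.

```python
def idle_cores(boxes, cores_per_box, processes):
    """Cores left idle when `processes` solver processes are spread evenly over `boxes`."""
    if processes % boxes != 0:
        raise ValueError("this sketch assumes processes divide evenly across boxes")
    per_box = processes // boxes
    if per_box > cores_per_box:
        raise ValueError("more processes per box than cores")
    return boxes * (cores_per_box - per_box)

# Assumed layout from the speculation above: 4 boxes, 4 cores each.
for procs in (4, 8, 16):
    print(procs, "processes ->", idle_cores(4, 4, procs), "idle cores")
```

Only the 16-process case loads every core, which is why that result is the one that matters for a fully paid-for cluster.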

Perhaps if the author had just reported his benchmark in the standard way, where the number of boxes or chips is included (e.g. on the Fluent website or for spec.org CPU benchmarks), he would have been able to give us something useful.

More worrying to me is the result for 16 cores - a closer look shows a 54% parallel scaling efficiency. So when all boxes are fully loaded, the performance is miserable.
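
For reference, parallel scaling efficiency is usually taken as the measured speedup divided by the ideal speedup for the core count. The sketch below is a minimal illustration of that definition. The timings are invented so that the 16-core case lands on 54%; they are not the Sun figures, and the choice of a 1-core baseline is an assumption.

```python
def parallel_efficiency(t_ref, n_ref, t_n, n):
    """Efficiency of an n-core run relative to a reference run on n_ref cores."""
    speedup = t_ref / t_n   # how much faster the n-core run actually is
    ideal = n / n_ref       # how much faster it would be with perfect scaling
    return speedup / ideal

# Hypothetical wall-clock times in seconds, chosen only for illustration.
t1 = 1000.0
t16 = t1 / (16 * 0.54)
eff = parallel_efficiency(t1, 1, t16, 16)
print(f"{eff:.0%}")
```

Put differently, at 54% efficiency 16 cores deliver roughly the throughput of 8.6 perfectly scaling cores.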

So far my search for the right solution has been time-consuming and frustrating. The unavailability of Barcelona benchmarks (spanning more than one box) and the unavailability of the CPUs in general point towards 2 options: (1) new Intel technology (Harpertown/Wolfdale) with its memory bandwidth bottleneck or (2) the older AMD dual cores. Pricing seems to indicate that option (2) is better because the high-end CPUs (3.0 GHz and 3.2 GHz) are ridiculously expensive.

In summary: for smaller parallel jobs (up to 16 cores) the Intel CPUs are best, but at larger sizes (40+ cores) it seems that AMD remains king - even with lower CPU clock speeds. If anyone can add alternative ways to look at this, it would be greatly appreciated.

May convergence be with you.

DISCLAIMER: I am not affiliated with any hardware or software vendors.

February 15, 2008, 07:28
#2
Phil (Guest)

I just built a 16-core rig (8x dual-core Wolfdale 45 nm E8400s) running 64-bit Linux, with standard GigE interconnect.

Running wind tunnel DES sims gives excellent per-core utilisation of 95-100%. It was absolutely CRUCIAL to use a suitable partition method (principal Z axis in my case) which cuts the domain like a sliced salami. METIS gave appalling scaling.

Some basics: Don't even think of using Intel quads ... scaling is appalling for 3+ cores. Look at the benchmarks on Fluent's site. Barcelona is a dud. Its performance at 2.2-2.4 GHz is piss poor even if it (hopefully) will scale well. My 3.8 GHz dual-core Wolfdales (3 GHz stock, overclocked to 3.8 GHz) would run rings around any quad Barcelona. Intel dual cores scale very well over Ethernet IF you are very thorough in evaluating the various partitioning methods. In my experience minimizing the number of partition neighbors is much more important than minimizing the interface cell ratios. If you are aiming for < 24 cores I wouldn't bother with a low-latency interconnect unless you know your problems will be difficult to partition efficiently. For a dedicated cluster, stringing together Wolfdales is cheaper and faster than stringing together Harpertown dual-socket systems.
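
The trade-off between partition neighbors and interface cells can be illustrated on a toy structured grid. This is only a sketch under stated assumptions: it is not Fluent's partitioner, the grid size and the two decompositions are hypothetical, and "neighbors" here means face-adjacent partitions only. It compares 16 slabs along Z ("sliced salami") with a 2x2x4 block layout.

```python
from itertools import product

def partition_stats(shape, blocks):
    """Return (max face-neighbours per partition, total interface faces) for a
    structured grid of `shape` cells cut into `blocks` equal boxes per axis."""
    nx, ny, nz = shape
    bx, by, bz = blocks

    def owner(i, j, k):
        # Partition id of cell (i, j, k) under a uniform block decomposition.
        return (i * bx // nx, j * by // ny, k * bz // nz)

    neighbours = {}
    interface_faces = 0
    for i, j, k in product(range(nx), range(ny), range(nz)):
        a = owner(i, j, k)
        neighbours.setdefault(a, set())
        # Look only in +x, +y, +z so each face is visited once.
        for di, dj, dk in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            ii, jj, kk = i + di, j + dj, k + dk
            if ii < nx and jj < ny and kk < nz:
                b = owner(ii, jj, kk)
                if a != b:
                    interface_faces += 1
                    neighbours[a].add(b)
                    neighbours.setdefault(b, set()).add(a)
    return max(len(s) for s in neighbours.values()), interface_faces

shape = (16, 16, 64)                         # hypothetical box, long in Z
slabs = partition_stats(shape, (1, 1, 16))   # "sliced salami" along Z
blocks = partition_stats(shape, (2, 2, 4))   # compact blocks
print("slabs :", slabs)
print("blocks:", blocks)
```

On this toy grid the slabs have more interface faces than the blocks but never more than two neighbors per partition, so each process exchanges data with fewer peers. That is consistent with the observation above that neighbor count can matter more than interface ratio on a high-latency interconnect such as gigabit Ethernet.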

CFX's parallel performance is much better than Fluent's. With Fluent you really have to configure the problem well to get good scaling. With CFX you have to really bungle things to get poor scaling. For the exact same mesh and case definition CFX gave me ~100% scaling efficiency over 16 cores, while Fluent's was ~70% ... which corresponds to Fluent's published benchmarks.

On the horizon, Intel's upcoming Nehalem looks to be godlike in terms of raw speed and multicore scaling.
