Home > Forums > Software User Forums > ANSYS > CFX

Parallel speed up

#1 - Soren (Guest), May 21, 2002, 07:38
Parallel speed up
Hi,

Does anyone have experience with dual-processor computers and CFX-5.5 under Linux?

What kind of speed-up is common compared to a single processor?

Thanks a lot.

Regards Soren


#2 - Holidays (Guest), May 21, 2002, 12:01
Re: Parallel speed up
I seem to remember that you can obtain a fairly linear relationship, provided you solve a large enough problem to dilute the effects of the partitioning (I saw a CFX presentation). But contact your vendor, since CFX is very likely to have done the comparison.
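The point that a large problem "dissolves" the partitioning overhead is essentially Amdahl's law. A minimal sketch (illustrative only; the serial fractions below are assumed, not measured CFX figures):

```python
def amdahl_speedup(n_procs, serial_fraction):
    """Ideal speed-up when a fixed fraction of the run cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

# A bigger mesh usually shrinks the serial (partitioning/communication) fraction,
# which is why large problems scale closer to linearly on 2 processors:
for f in (0.20, 0.05, 0.01):
    print(f"serial fraction {f}: 2-proc speed-up {amdahl_speedup(2, f):.2f}")
    # prints 1.67, 1.90 and 1.98 respectively
```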

#3 - Neale (Guest), May 21, 2002, 13:02
Re: Parallel speed up
CFX-5.5 gets speed-ups of 1.6-1.8 under Linux, depending on the problem size. High-end workstations do better, where 1.9-2.1 is typical. The memory and cache architectures on Intel/AMD Linux boxes are simply not good enough to reach comparable speed-ups.

Neale


#4 - Soren (Guest), May 22, 2002, 03:14
Re: Parallel speed up
Hi

Thanks for the reply.

I know that under Windows NT/2000/XP the parallel performance of a dual-processor computer is very poor.

The speed-up is about 1.1 to 1.2.

That's why I am looking at Linux.

Any comments?

Regards

Soren


#5 - Astrid (Guest), May 22, 2002, 04:41
Re: Parallel speed up
Using CFX-5.5 on a Pentium IV with Windows NT, we obtained a speed-up of about 1.8-2.0. But we have only tested it with up to 4 PCs.

Astrid

#6 - Soren (Guest), May 22, 2002, 05:43
Re: Parallel speed up
Hi Astrid

Is the computer single or dual processor?

Regards

Soren

#7 - cfd guy (Guest), May 22, 2002, 08:44
Re: Parallel speed up
I use TASCflow and CFX-5.5 on a dual PIII PC. I've seen a speed-up of about 1.4-1.6 in CFX-5 and 1.6-1.8 in TASCflow, depending on the problem size. I only ran local parallel with two partitions.
cfd guy

#8 - Neale (Guest), May 22, 2002, 16:46
Re: Parallel speed up
Linux generally seems to do a better job at dynamic process management (i.e., multitasking), so you usually see slightly better speed-ups there. I've typically seen on the order of 1.4-1.6 on NT, and 1.6-1.8 on Linux, for CFX-5.5.

Neale.


#9 - Neale (Guest), May 22, 2002, 16:49
Re: Parallel speed up
Astrid,

Do you mean you ran a 4-process job on 4 PCs and only got a 1.8-2.0 speed-up? What problem size were you running? For a 4-process job you would need at least 400,000-600,000 elements to see a decent speed-up.

Neale

#10 - Jens (Guest), May 23, 2002, 03:11
Re: Parallel speed up
Hi

I am curious about these speed-ups. I am running indoor-air and HVAC problems with mesh sizes from 400k to 2,000k elements on a Windows NT box with dual P4 processors.

The speed-up I am getting is below 1.2.

Are you applying something special?

Thanks

Regards

Jens

#11 - Robin (Guest), May 23, 2002, 12:22
Re: Parallel speed up
Hi Jens,

How much RAM are you using? For a 2-million-node problem, I'd be surprised if you were not running into swap space. In that case, you will see the best speed-up if you run it on multiple systems, at least enough to get it all into RAM and out of swap.

Robin
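The swap-space concern can be roughed out with a quick memory estimate. A minimal sketch using the ~1.7 kB/node figure implied elsewhere in this thread (120,000 nodes taking roughly 200 MB); the numbers are rough assumptions, not CFX specifications:

```python
def solver_memory_gb(n_nodes, kb_per_node=1.7):
    """Very rough estimate of solver working-set size for a uvwp + k-eps run."""
    return n_nodes * kb_per_node / (1024 * 1024)

ram_gb = 1.2                            # assumed physical RAM on the box
need_gb = solver_memory_gb(2_000_000)   # a 2M-node problem
print(f"need ~{need_gb:.2f} GB, fits in RAM: {need_gb <= ram_gb}")
# prints: need ~3.24 GB, fits in RAM: False
```

By this estimate a 2M-node case is well over 1.2 GB, so distributing it across enough machines to keep everything in RAM should pay off before any parallel speed-up is even counted.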

#12 - Jens (Guest), May 23, 2002, 14:43
Re: Parallel speed up
Hi

I have benchmarked using an HVAC problem with 600,000 cells. The speed-up was 1.15 on a dual P4 with 1.2 GB of RAM.

Any hints ?

Regards

Jens

#13 - Neale (Guest), May 24, 2002, 11:05
Re: Parallel speed up
How were you calculating the speed-up? You should use the CFD solver start and finish times in the output file.

600,000 cells means roughly 120,000 nodes (for a tet grid, I assume), which should only take about 180-200 MB for a u-v-w-p plus k-epsilon run. So swapping probably isn't an issue.

Make sure you do your performance measurements on a "clean" machine, i.e., one that isn't running anything other than the CFD calculation.

Neale.
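The measurement advice above amounts to comparing solver wall-clock times between runs. A minimal sketch; the timestamp format and values are made up, and parsing them out of a real CFX output file is left aside:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def wall_seconds(start, finish):
    """Elapsed solver wall-clock time between two timestamp strings."""
    return (datetime.strptime(finish, FMT) - datetime.strptime(start, FMT)).total_seconds()

# Hypothetical timestamps taken from a serial run and a 2-process run:
t_serial = wall_seconds("2002-05-24 10:00:00", "2002-05-24 12:00:00")  # 7200 s
t_dual   = wall_seconds("2002-05-24 13:00:00", "2002-05-24 14:05:00")  # 3900 s

print(f"speed-up: {t_serial / t_dual:.2f}")  # prints: speed-up: 1.85
```

Using only the solver's own start/finish times excludes mesh import and partitioning overhead, which is why it gives a cleaner figure than timing the whole job by hand.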

#14 - cfd guy (Guest), May 27, 2002, 12:56
Architectures Benchmark
Hi Jens,
As this discussion is very interesting, I'd like to propose the following benchmark. It would be very interesting if users shared their speed-up data. I've built a very simple case (a rectangular channel) with approximately 960k cells (hybrid mesh with inflation). I ran this definition file on a Sun workstation under Solaris 8 with 4 processors. Here is some data about the case:

3D, turbulent (k-epsilon), incompressible (air), steady-state flow. Number of cells: almost 948,000.

Run      Speed-up
Serial   1.00
2 proc.  2.08
3 proc.  3.03
4 proc.  4.02

Why don't you test it on your NT machine? I could send you the journal file so that you could easily regenerate the definition file. If anyone else wants the journal file, please feel free to mail me.

PS1: Make sure you're not running any other applications on your machine.
PS2: Rebuilding the journal file on my NT machine, the resulting mesh has 947,916 elements; rebuilding it on my UNIX system, the resulting file has 948,161 elements. I believe that will be no problem at all for benchmarking purposes.
PS3: I think it's the simplest case you could ever imagine: a simple monoblock geometry with no bad angles and no grid interfaces. I believe the speed-up also depends on some geometric characteristics.

Kind regards, cfd guy

#15 - cfd guy (Guest), May 28, 2002, 14:02
Re: Architectures Benchmark
About my previous post: I've tested two coarser grids against the first one. Here are the results:

Grid 2: 468,500 elements. Speed-up: 2 processes = 1.92, 3 processes = 2.74, 4 processes = 3.41
Grid 3: 109,400 elements. Speed-up: 2 processes = 1.49, 3 processes = 2.15, 4 processes = 2.80

I'm not trying to find the optimal mesh size for this problem, but it seems that, in this case, each processor needs more than about 200k elements to obtain a linear relation between the speed-up and the number of processes.
cfd guy
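Another way to read these numbers is as parallel efficiency (speed-up divided by process count). A small sketch over the figures quoted in this thread:

```python
# Speed-ups posted above, keyed by element count, then by number of processes.
results = {
    948_000: {2: 2.08, 3: 3.03, 4: 4.02},
    468_500: {2: 1.92, 3: 2.74, 4: 3.41},
    109_400: {2: 1.49, 3: 2.15, 4: 2.80},
}

for elements, speedups in results.items():
    for n_proc, speedup in sorted(speedups.items()):
        eff = speedup / n_proc  # 1.00 would be perfectly linear scaling
        print(f"{elements:>7} elements, {n_proc} procs: efficiency {eff:.2f}")
```

The efficiency stays at or slightly above 1.0 for the 948k-cell grid, but falls to 0.70 at 4 processes on the 109k-cell grid, consistent with the ~200k-elements-per-process observation above.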

#16 - Astrid (Guest), May 30, 2002, 17:38
Re: Parallel speed up
Soren,

We used 4 PCs in distributed parallel on a 100BaseT network.

Astrid

#17 - Astrid (Guest), May 30, 2002, 17:44
Re: Parallel speed up
Sorry, I was wrong. I didn't mean to confuse you.

We ran a job with roughly 1.5M elements on 1 standalone PC and on 2 PCs in distributed parallel. The speed-up was then 1.8-2.0. With 4 PCs, the speed-up was about 3.6.

Astrid

#18 - Robin (Guest), May 30, 2002, 17:47
Re: Architectures Benchmark
cfd guy,

The number of nodes is more relevant to parallel efficiency. Can you post the number of nodes in your mesh rather than the number of elements?

Typically, the best efficiency is achieved when the number of nodes per partition is greater than 100k. Below about 20k nodes per partition the trend may reverse, with runs taking longer as partitions are added (due to increased communication).

Robin

#19 - Neale (Guest), May 31, 2002, 13:26
Re: Architectures Benchmark
Actually, it's fine to quote elements as well; the two are related anyway (roughly 1:1 for hex grids, and 5-6:1 for tet/hybrid grids). In fact, the assembly really scales with the number of elements, since the CFX-5 solver uses an element-based assembly.

I'm not surprised by the results, though, as 50,000 vertices per partition translates into roughly 200k elements on a tet/hybrid grid. This is what we see in our parallel results as well.

Neale.

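The element-to-node ratios quoted above can be folded into a quick partition-sizing check. A minimal sketch; the ratios are the rough figures from this post, and the helper names are made up:

```python
# Rough elements-per-node ratios quoted above: ~1 for hex, ~5-6 for tet/hybrid.
RATIO = {"hex": 1.0, "tet": 5.5}

def nodes_per_partition(n_elements, mesh_type, n_partitions):
    """Approximate vertices each partition gets, estimated from an element count."""
    return n_elements / RATIO[mesh_type] / n_partitions

# The 600,000-cell tet case discussed earlier, split across 2 processes:
print(round(nodes_per_partition(600_000, "tet", 2)))  # prints 54545
```

That is well below the 100k-nodes-per-partition sweet spot mentioned earlier in the thread, which is one plausible reading of the modest dual-processor speed-ups reported for that case.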
