January 25, 2003, 10:21 |
Distributed parallel error in CFX 5.5.1
|
#1 |
Guest
Posts: n/a
|
Dear All, I'm trying to run in distributed parallel mode on two PCs (each a dual P4 2.4 GHz with 2 GB RAM, SuSE 8.1 Linux), using 8 partitions (4 on each machine). With a fairly large model, the solver gives the following error at the second step, time and time again. What should be done? I have already had successful calculations under the same conditions with a bigger model (more mesh elements).
OUTER LOOP ITERATION = 2    CPU SECONDS = 1.44E+03
-----------------------------------------------------
| Equation | Rate | RMS Res | Max Res | Linear Solution |
+----------------------+------+---------+---------+
+--------------------------------------------------+
| ERROR #001100279 has occurred in subroutine ErrAction. |
| Message: |
| Floating point exception: Type Unknown |
+--------------------------------------------------+
+--------------------------------------------------+
| ERROR #001100279 has occurred in subroutine ErrAction. |
| Message: |
| Stopped in routine c_fpx_handler |
+--------------------------------------------------+
An error has occurred in cfx5solve:
The CFX-5 solver has terminated without writing a results file.
End of solution stage.
This run of the CFX-5 Solver has finished.

Any help would be appreciated.
|
January 25, 2003, 13:35 |
Re: Distributed parallel error in CFX 5.5.1
|
#2 |
Guest
Posts: n/a
|
This doesn't look like it has anything to do with running in parallel; the solver has simply overflowed. If you already have a solution on a finer mesh, then I'd suggest interpolating that solution onto your coarser mesh (Tools > Interpolate, I think, from the Solver Manager menu). This will give you a much better initial guess, and the solver is much less likely to fail. Mike
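To illustrate the point about the initial guess, here is a minimal Python sketch. It is only a toy and has nothing to do with the CFX solver internals: it applies Newton's method to atan(x) = 0, a textbook case where a starting value close to the solution converges in a few steps, while a poor one makes the iterates grow until they overflow to infinity. The function name `newton_atan` and its parameters are made up for this example.

```python
import math

def newton_atan(x0, max_iters=50, tol=1e-12):
    """Toy Newton iteration for atan(x) = 0, started from the guess x0.

    Returns ("converged", n), ("overflow", n) or ("max_iters", max_iters),
    where n is the iteration at which the outcome was detected.
    This is an illustration only, not how CFX iterates.
    """
    x = x0
    for it in range(1, max_iters + 1):
        # Newton step x - f(x)/f'(x) with f = atan and f' = 1 / (1 + x^2).
        # For huge x, x * x overflows silently to inf in Python floats.
        x = x - math.atan(x) * (1.0 + x * x)
        if not math.isfinite(x):
            return "overflow", it
        if abs(x) < tol:
            return "converged", it
    return "max_iters", max_iters

if __name__ == "__main__":
    print(newton_atan(0.5))  # good initial guess
    print(newton_atan(2.0))  # poor initial guess
```

A real solver would raise a floating point exception instead of returning a status, but the mechanism is the same idea: a bad starting field can send the iteration away from the solution until the numbers no longer fit in a float.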
|
|
January 25, 2003, 17:26 |
still Distributed parallel error in CFX 5.5.1
|
#3 |
Guest
Posts: n/a
|
Still the same... any idea how to solve this? THX. What I don't understand is that the model is almost the same as the other one, with only minor geometry modifications, and with the other one I had no problems at all.
====================================================
OUTER LOOP ITERATION = 2    CPU SECONDS = 1.29E+03
| Equation | Rate | RMS Res | Max Res | Linear Solution |
+----------------------+------+------
Parallel run: Received message from slave
-----------------------------------------
Slave partition : 5
Slave routine : ErrAction
Master location : RCVBUF,MSGTAG=1033
Message label : 001100279
Message follows below - :
+------------------------------------------+
| ERROR #001100279 has occurred in subroutine ErrAction. |
| Message: |
| Floating point exception: Type Unknown |
+------------------------------------------+
Parallel run: Received message from slave
-----------------------------------------
Slave partition : 5
Slave routine : ErrAction
Master location : RCVBUF,MSGTAG=1033
Message label : 001100279
Message follows below - :
+------------------------------------------+
| ERROR #001100279 has occurred in subroutine ErrAction. |
| Message: |
| Stopped in routine c_fpx_handler |
+------------------------------------------+
An error has occurred in cfx5solve:
The CFX-5 solver has terminated without writing a results file.
End of solution stage.
This run of the CFX-5 Solver has finished.
|
January 26, 2003, 12:05 |
Re: still Distributed parallel error in CFX 5.5.1
|
#4 |
Guest
Posts: n/a
|
OK, I found it... Tell me who's the stupid one, me or CFX: I had to slightly modify the geometry. I had thin surfaces; Build generates 2 entries (for "both sides") for thin surfaces, as one can check in Post. But how come, after my modifications (which had nothing to do with the actual thin surfaces), it associates two absolutely different surfaces with the second entry of the original thin surface? ... which were actually exterior walls by default.
Anyway...
|
January 26, 2003, 16:19 |
Re: Distributed parallel error in CFX 5.5.1
|
#5 |
Guest
Posts: n/a
|
Can you explain to me why you use more partitions (8) than you have processors (4)?
Pascale
|
January 27, 2003, 06:46 |
Re: Distributed parallel error in CFX 5.5.1
|
#6 |
Guest
Posts: n/a
|
Yeah, I didn't follow that part either. I always thought you had one partition per CPU. Is this incorrect thinking?
|
|
January 27, 2003, 19:22 |
Re: Distributed parallel error in CFX 5.5.1
|
#7 |
Guest
Posts: n/a
|
Physically 1 CPU is logically 2. I don't understand it either (does anyone?), but this is our experience on both SuSE Linux and WinXP, and we found it a bit faster with 2 partitions per physical processor.
And a question about that: under Win NT 4, with a P4 1.8 GHz and 2 GB RAM, the solver says "the problem does not fit in memory" when I start a model with 2.8 million mesh elements in SERIAL. Starting in LOCAL PARALLEL with 2 partitions, it runs fine and doesn't exceed the physical memory limit (using 1.8 GB out of 2). Any explanation? Lots of THX, Bog
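The partition count discussed in this thread comes down to simple arithmetic. A minimal Python sketch, assuming (as the post above describes) that each physical CPU shows up as 2 logical CPUs; the function names are hypothetical and are not part of CFX:

```python
import os

def partitions_per_host(physical_cpus, logical_per_physical=2):
    """Partitions on one machine if every logical CPU gets one partition.

    logical_per_physical=2 is an assumption taken from the post above
    ("physically 1 CPU is logically 2"); use 1 for one partition per CPU.
    """
    return physical_cpus * logical_per_physical

def total_partitions(hosts, physical_cpus_per_host, logical_per_physical=2):
    """Total partitions across identical hosts in a distributed run."""
    return hosts * partitions_per_host(physical_cpus_per_host, logical_per_physical)

if __name__ == "__main__":
    # Setup from the first post: two machines with two physical CPUs each.
    print(total_partitions(2, 2))
    # One partition per physical CPU instead.
    print(total_partitions(2, 2, logical_per_physical=1))
    # Logical CPUs the operating system reports on the machine running this.
    print(os.cpu_count())
```

Whether 2 partitions per physical processor is actually faster is an empirical question; the post only reports that it was "a bit faster" on the machines in question.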