Cluster error: Fatal error in one of the compute processes

#1
Karla Jacinto (karlag)
January 4, 2023, 06:02
Hi,

I'm running my jobs on a cluster, and not for the first time, the following message appears after a number of time steps that completed without any error:


===============Message from the Cortex Process================================

Fatal error in one of the computing processes.

==============================================================================

It usually appears right after a time step has converged.
Another log file (fluent-error.log) contains more information that I don't understand, for example:

myid (30): Fatal signal raised sig = SIGIOT
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/lnamd64/3ddp_node/fluent_mpi.20.1.0() [0x225cabc]
/lib64/libpthread.so.0(+0xf630) [0x7f29717ff630]
/lib64/libc.so.6(gsignal+0x37) [0x7f2967b24387]
/lib64/libc.so.6(abort+0x148) [0x7f2967b25a78]
/lib64/libc.so.6(+0x78f67) [0x7f2967b66f67]
/lib64/libc.so.6(+0x81329) [0x7f2967b6f329]
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/multiport/mpi/lnamd64/ibmmpi/lib/linux_amd64/libmpi.so.1(free+0x2e) [0x7f296641eece]
/lib64/libc.so.6(+0x39d10) [0x7f2967b27d10]
/lib64/libc.so.6(+0x39d37) [0x7f2967b27d37]
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/multiport/lnamd64/mpi/shared/libmport.so(+0x7996a) [0x7f297473796a]
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/multiport/lnamd64/mpi/shared/libmport.so(+0x79a85) [0x7f2974737a85]
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/multiport/lnamd64/mpi/shared/libmport.so(+0x81b5a) [0x7f297473fb5a]
/lib64/libpthread.so.0(+0x7ea5) [0x7f29717f7ea5]
/lib64/libc.so.6(clone+0x6d) [0x7f2967becb0d]
myid (30): Fatal signal raised sig = SIGSEGV
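
For anyone trying to read a trace like the one above, here is a minimal Python sketch that lists which rank raised which signal and which frames carry a symbol name. It assumes the log lines have exactly the `path(symbol+offset) [address]` shape shown here; `parse_trace`, `named_frames` and `SAMPLE` are hypothetical names made up for this illustration and are not part of Fluent. On Linux, SIGIOT is another name for SIGABRT, and the frames are listed innermost first.

```python
import re
from pathlib import PurePosixPath

# A few lines copied from the fluent-error.log excerpt above.
SAMPLE = """myid (30): Fatal signal raised sig = SIGIOT
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/lnamd64/3ddp_node/fluent_mpi.20.1.0() [0x225cabc]
/lib64/libpthread.so.0(+0xf630) [0x7f29717ff630]
/lib64/libc.so.6(gsignal+0x37) [0x7f2967b24387]
/lib64/libc.so.6(abort+0x148) [0x7f2967b25a78]
/lib64/libc.so.6(+0x78f67) [0x7f2967b66f67]
/soft/ANSYS/2020R1/ansys_inc/v201/fluent/fluent20.1.0/multiport/mpi/lnamd64/ibmmpi/lib/linux_amd64/libmpi.so.1(free+0x2e) [0x7f296641eece]
myid (30): Fatal signal raised sig = SIGSEGV
"""

# Assumed line shapes, taken only from the excerpt in this thread.
SIGNAL_RE = re.compile(r"^myid \((?P<rank>\d+)\): Fatal signal raised sig = (?P<sig>\w+)$")
FRAME_RE = re.compile(
    r"^(?P<path>/\S+?)\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\)"
    r"\s+\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


def parse_trace(text):
    """Return (signals, frames) found in the log text.

    signals: list of (rank, signal_name)
    frames:  list of (library_file_name, symbol_or_None, offset_or_None)
    """
    signals, frames = [], []
    for raw in text.splitlines():
        line = raw.strip()
        m = SIGNAL_RE.match(line)
        if m:
            signals.append((int(m["rank"]), m["sig"]))
            continue
        m = FRAME_RE.match(line)
        if m:
            lib = PurePosixPath(m["path"]).name  # keep only the file name
            frames.append((lib, m["symbol"] or None, m["offset"]))
    return signals, frames


def named_frames(frames):
    """Keep only the frames that expose a symbol name."""
    return [(lib, sym) for lib, sym, _ in frames if sym]


if __name__ == "__main__":
    signals, frames = parse_trace(SAMPLE)
    for rank, sig in signals:
        print(f"rank {rank}: {sig}")
    for lib, sym in named_frames(frames):
        print(f"{lib}: {sym}")
```

Run on the excerpt, the named frames are gsignal and abort in libc.so.6 and free in libmpi.so.1, so the abort on rank 30 is reached from inside a free call in the MPI library. That only narrows down where the process died; it does not identify the cause.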

Does anyone know how to solve this?

Thanks for your help.

#2
Christoph D (Christoph_D)
March 4, 2024, 11:14
Did you find a solution for this problem? I have exactly the same problem (also on a cluster after simulating for some time, using ANSYS FLUENT 2022R2).
Tags
cluster, cortex, error, fluent, hpc


All times are GMT -4.