What causes the error below in a Fluent calculation on a cluster? It just happens abruptly. How t |
July 14, 2022, 05:47 |
#1 |
Senior Member
Join Date: Dec 2017
Posts: 388
Rep Power: 9 |
What causes the error below in a Fluent calculation on a cluster? It just happens abruptly. How can I solve it?
Fatal error has happened to some of the processes! Exiting ... ===============Message from the Cortex Process================================ Fatal error in one of the compute processes. ================================================== ============================ ================================================== ============================ Stack backtrace generated for process id 2717 on signal 11 : *** Error in `fluent': corrupted double-linked list: 0x0000000001f847c0 *** ======= Backtrace: ========= /usr/lib64/libc.so.6(+0x7bd95)[0x2af635affd95] /usr/lib64/libc.so.6(+0x7de35)[0x2af635b01e35] /usr/lib64/libc.so.6(__libc_malloc+0x4c)[0x2af635b0387c] /usr/lib64/libc.so.6(__backtrace_symbols+0x10e)[0x2af635b8e33e] fluent(print_back_trace_to_file+0x5a)[0x68d76a] *** Error in `fluent': corrupted double-linked list: 0x0000000001f84760 *** fluent[0x67f3b9] /usr/lib64/libc.so.6(+0x35670)[0x2af635ab9670] /usr/lib64/libc.so.6(+0x38dcd)[0x2af635abcdcd] ======= Backtrace: ========= /usr/lib64/libc.so.6(+0x38eb5)[0x2af635abceb5] fluent[0x677829] /usr/lib64/libc.so.6(+0x35670)[0x2af635ab9670] /usr/lib64/libc.so.6(__select+0x33)[0x2af635b71943] fluent[0x65eb86] fluent(lreadf+0x29)[0x6e1b99] /usr/lib64/libc.so.6(+0x7bd95)[0x2af635affd95] /usr/lib64/libc.so.6(+0x7cec6)[0x2af635b00ec6] /opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(+0x9bfbc)[0x2af634afafbc] /opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(+0x9ca27)[0x2af634afba27] /opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(PyDict_SetItem+0x67)[0x2af634afd487] fluent(eval+0x497)[0x6db8d7] /opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(_PyModule_Clear+0x14c)[0x2af634b015bc] 
/opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(PyImport_Cleanup+0x24f)[0x2af634b8288f] /opt/application/ansys19/v192/fluent/../commonfiles/CPython/2_7_13/linx64/Release/python/lib/libpython2.7.so.1.0(Py_Finalize+0xfe)[0x2af634b948de] /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libExpr.so(_ZN13PyInitializerD1Ev+0x6)[0x2af630f3a716] /usr/lib64/libc.so.6(__cxa_finalize+0x9a)[0x2af635abd1da] /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libExpr.so(+0x635c3)[0x2af630ecb5c3] ======= Memory map: ======== 00400000-0124d000 r-xp 00000000 00:26 425291452 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/cortex.19.2.0 0144d000-01477000 r--p 00e4d000 00:26 425291452 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/cortex.19.2.0 01477000-014fb000 rw-p 00e77000 00:26 425291452 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/cortex.19.2.0 014fb000-0164d000 rw-p 00000000 00:00 0 01eb2000-022cb000 rw-p 00000000 00:00 0 [heap] 2af62e099000-2af62e0ba000 r-xp 00000000 08:03 17043974 /usr/lib64/ld-2.17.so 2af62e0ba000-2af62e219000 rw-p 00000000 00:00 0 2af62e219000-2af62e220000 r--s 00000000 08:03 17305077 /usr/lib64/gconv/gconv-modules.cache 2af62e220000-2af62e299000 rw-p 00000000 00:00 0 2af62e29a000-2af62e29b000 rw-p 00000000 00:00 0 2af62e2ba000-2af62e2bb000 r--p 00021000 08:03 17043974 /usr/lib64/ld-2.17.so 2af62e2bb000-2af62e2bc000 rw-p 00022000 08:03 17043974 /usr/lib64/ld-2.17.so 2af62e2bc000-2af62e2bd000 rw-p 00000000 00:00 0 2af62e2bd000-2af62e550000 r-xp 00000000 00:26 867021368 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libimf.so 2af62e550000-2af62e74f000 ---p 00293000 00:26 867021368 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libimf.so 2af62e74f000-2af62e755000 r--p 00292000 00:26 867021368 
/opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libimf.so 2af62e755000-2af62e7aa000 rw-p 00298000 00:26 867021368 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libimf.so 2af62e7aa000-2af62f489000 r-xp 00000000 00:26 867021450 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libsvml.so 2af62f489000-2af62f688000 ---p 00cdf000 00:26 867021450 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libsvml.so 2af62f688000-2af62f6c3000 r--p 00cde000 00:26 867021450 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libsvml.so 2af62f6c3000-2af62f6c8000 rw-p 00d19000 00:26 867021450 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libsvml.so 2af62f6c8000-2af62f730000 r-xp 00000000 00:26 867021370 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libintlc.so.5 2af62f730000-2af62f930000 ---p 00068000 00:26 867021370 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libintlc.so.5 2af62f930000-2af62f931000 r--p 00068000 00:26 867021370 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libintlc.so.5 2af62f931000-2af62f932000 rw-p 00069000 00:26 867021370 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libintlc.so.5 2af62f932000-2af62f933000 rw-p 00000000 00:00 0 2af62f933000-2af62fa92000 r-xp 00000000 00:26 867021443 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libirng.so 2af62fa92000-2af62fc92000 ---p 0015f000 00:26 867021443 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libirng.so 2af62fc92000-2af62fc93000 r--p 0015f000 00:26 867021443 /opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libirng.so 2af62fc93000-2af62fca6000 rw-p 00160000 00:26 867021443 
/opt/application/ansys19/v192/tp/IntelCompiler/2017.6.256/linx64/lib/intel64/libirng.so 2af62fca6000-2af62fcbc000 r-xp 00000000 08:03 17044007 /usr/lib64/libpthread-2.17.so 2af62fcbc000-2af62febc000 ---p 00016000 08:03 17044007 /usr/lib64/libpthread-2.17.so 2af62febc000-2af62febd000 r--p 00016000 08:03 17044007 /usr/lib64/libpthread-2.17.so 2af62febd000-2af62febe000 rw-p 00017000 08:03 17044007 /usr/lib64/libpthread-2.17.so 2af62febe000-2af62fec2000 rw-p 00000000 00:00 0 2af62fec2000-2af62fee8000 r-xp 00000000 00:26 425291467 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libCxHoops.so 2af62fee8000-2af6300e7000 ---p 00026000 00:26 425291467 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libCxHoops.so 2af6300e7000-2af6300e8000 r--p 00025000 00:26 425291467 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libCxHoops.so 2af6300e8000-2af6300e9000 rw-p 00026000 00:26 425291467 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libCxHoops.so 2af6300e9000-2af630517000 r-xp 00000000 00:26 425291458 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libStateEngine.so 2af630517000-2af630717000 ---p 0042e000 00:26 425291458 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libStateEngine.so 2af630717000-2af630719000 r--p 0042e000 00:26 425291458 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libStateEngine.so 2af630719000-2af630733000 rw-p 00430000 00:26 425291458 /opt/application/ansys19/v192/fluent/fluent19.2.0/cortex/lnamd64/libStateEngine.so 2af630733000-2af630734000 rw-p 00000000 00:00 0 2af630734000-2af6307b9000 r-xp 00000000 00:26 425291459 |
|
July 14, 2022, 12:42 |
|
#2 |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,750
Rep Power: 66 |
This is the libc version of a segmentation fault which means it tried to access memory and then couldn't. Frankly this could be caused by anything. Maybe someone unplugged your RAM or spilt coffee on it. Or maybe you have code that tries to access variables that haven't been declared yet.
|
|
July 14, 2022, 22:24 |
Hello, do you use a cluster? Do you find the cluster faster than a single PC?
|
#3 | |
Senior Member
Join Date: Dec 2017
Posts: 388
Rep Power: 9 |
||
July 14, 2022, 22:46 |
|
#4 |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,750
Rep Power: 66 |
Yes I use a cluster. And I employ just a tiny bit of common sense when I do. Most problems that I run on a cluster don't fit on one PC. So yes, it's infinitely faster.
|
|
July 14, 2022, 22:53 |
What is the reason that it does not fit on one PC? Why is the cluster faster than the PC?
|
#5 |
Senior Member
Join Date: Dec 2017
Posts: 388
Rep Power: 9 |
||
July 14, 2022, 23:04 |
|
#6 |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,750
Rep Power: 66 |
Do you still have a Pentium CPU or do you have a modern multi-core CPU? Applications in general can run with more throughput via multithreading. Clusters are just massive versions of that.
I often need upwards of 200 GB of RAM to open my model. I only have 96 GB of RAM on my workstation. Even my smaller models that I can open on my workstation would take weeks to run. I run it on a cluster to get results on the same day. |
|
July 15, 2022, 04:51 |
Hi, my workstation has a Xeon Platinum 8273CL CPU, what about your workstation and clu
|
#7 | |
Senior Member
Join Date: Dec 2017
Posts: 388
Rep Power: 9 |
Quote:
If the cluster and the workstation have the same hardware, do they run at the same speed? |
||
July 15, 2022, 12:15 |
|
#8 |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,750
Rep Power: 66 |
If you have two computers with the same hardware not running at the same speed then either one is defective or you really did something wrong.
But instead of asking why others are not having issues, how about providing details relevant to yourself that might elucidate the issues you are having? That would be more helpful to you. |
|
July 16, 2022, 23:29 |
Sorry, maybe I did not state the details. The cluster has a CPU of E5-2605v4, an
|
#9 | |
Senior Member
Join Date: Dec 2017
Posts: 388
Rep Power: 9 |
||
July 17, 2022, 02:45 |
|
#10 |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,750
Rep Power: 66 |
An E5-2650V4 has 12 cores, 24 hyperthreaded. A 6226R has 16 cores.
If you use 24 cores on an E5-2650V4, you'll likely saturate the CPU and it runs at 100%. If you use 24 cores on a 6226R, it will be very suboptimal. The first block of 16 will do their work, and then the remaining 8 must wait for the first block of 16 to finish. Not only does this guarantee you have at least 25% idle time, it doubles the number of CPU cycles needed to complete 1 iteration. Since I can't really tell what else might be wrong, I recommend running at less than capacity, i.e. 11 cores on both machines, for a fairer comparison. |
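The arithmetic above can be sketched in a few lines of Python. This is only a toy model, not anything Fluent does internally: it assumes every process needs the same amount of work per iteration, that each core slot runs one process at a time, and that processes beyond the slot count wait for a second "round". The function name `oversubscription_estimate` is made up for illustration.

```python
import math


def oversubscription_estimate(processes: int, core_slots: int) -> dict:
    """Toy estimate of rounds and idle time when processes exceed core slots.

    Assumption (not from Fluent): equal work per process, one process per
    core slot at a time, leftover processes run in a later round.
    """
    if processes <= 0 or core_slots <= 0:
        raise ValueError("processes and core_slots must be positive")
    # Number of sequential rounds needed to give every process a turn.
    rounds = math.ceil(processes / core_slots)
    # Total slot-rounds available versus the number actually used.
    available = rounds * core_slots
    idle_fraction = 1 - processes / available
    return {"rounds": rounds, "idle_fraction": idle_fraction}


# 24 processes on 16 cores: two rounds, a quarter of the slot-rounds idle.
print(oversubscription_estimate(24, 16))
# 24 processes on 24 hardware threads: one round, nothing idle in this model.
print(oversubscription_estimate(24, 24))
```

With 24 processes on 16 slots the model gives two rounds and 25% idle, which matches the numbers in the post. Real solvers synchronize between partitions and the OS time-slices processes, so treat this as a rough illustration only.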