
Running on 4 node error MPI Errors[320798736]

Old   April 1, 2020, 04:34
Default Running on 4 node error MPI Errors[320798736]
  #1
New Member
 
Join Date: Sep 2012
Posts: 26
Rep Power: 14
tailele is on a distinguished road
Good morning, I have a question. Until now I have been using 3 nodes with 24 cores (v3) per node, and everything has worked well. Yesterday I bought another node, in this case a dual Xeon (v4). At the moment only one CPU is working, because the second one has no cooling yet. As always, I started my run including the new node, but as the subject says, I get an MPI error.

The log is this:
SEVERE [star.base.neo.ClientNotifyHandler]: MPI Errors[320798736] : MPI_Waitall: Internal MPI error: Filename too long

MPI Errors[320798736] : MPI_Waitall: Internal MPI error: Filename too long

Now I have tried some tests.
Nodes 1, 2 and 3 are the old nodes; node 4 is the new one.

Tests:
Nodes 1, 2, 3, 4 with only one core per node -- test passed
Nodes 1, 2, 3, 4 with 16 cores per node -- test passed
Nodes 1, 2, 4 with 24 cores on nodes 1 and 2 and 16 cores on node 4 -- test passed
Nodes 1, 2, 3 with 24 cores per node and node 4 with 16 cores -- test failed
Nodes 1, 2, 3 with 24 cores per node and node 4 with 1 core -- test failed

Does anyone have an idea?
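
For reference, MPI_Waitall is the call a domain-decomposed solver typically uses to wait for all of its outstanding boundary-exchange messages, so the error is being reported from inside the MPI layer during ordinary point-to-point communication. A minimal stand-alone reproduction of that pattern, which could be run over the same four hosts to separate an MPI/network problem from a STAR-CCM+ one, might look like the sketch below (mpi4py and numpy are assumed to be installed on the nodes; the script name and launch line are only examples):

Code:
# A minimal nonblocking ring exchange ending in MPI_Waitall -- the routine
# named in the error above. mpi4py and numpy are assumed to be installed on
# every node; launch it over the same host list as the failing job, e.g.
#   mpiexec -n 88 python ring_test.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

left = (rank - 1) % size             # neighbour ranks in a ring
right = (rank + 1) % size

sendbuf = np.full(1024, rank, dtype=np.float64)
recvbuf = np.empty(1024, dtype=np.float64)

# Post a nonblocking receive and send, then wait for both to complete.
# This Irecv/Isend/Waitall pattern is what partitioned solvers use to
# exchange boundary data between partitions.
reqs = [comm.Irecv(recvbuf, source=left, tag=0),
        comm.Isend(sendbuf, dest=right, tag=0)]
MPI.Request.Waitall(reqs)

print(f"rank {rank}/{size} on {MPI.Get_processor_name()}: exchange OK")

If a test like this also fails only when the mixed core counts are used, the problem is in the MPI/interconnect layer rather than in STAR-CCM+ itself.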

Old   April 1, 2020, 12:49
Default
  #2
Senior Member
 
Sebastian Engel
Join Date: Jun 2011
Location: Germany
Posts: 567
Rep Power: 21
bluebase will become famous soon enough
Is your cluster running a Windows system, by chance?

Old   April 1, 2020, 13:14
Default
  #3
New Member
 
Join Date: Sep 2012
Posts: 26
Rep Power: 14
tailele is on a distinguished road
Quote:
Originally Posted by bluebase
Is your cluster running a Windows system, by chance?
Yes, Windows 10....

Old   April 1, 2020, 14:03
Default
  #4
New Member
 
Join Date: Sep 2012
Posts: 26
Rep Power: 14
tailele is on a distinguished road
I changed the MPI from IBM to Intel. The error is still there, but it isn't the same:

SEVERE [star.base.neo.ClientNotifyHandler]: Connection reset
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:210)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.read(BufferedReader.java:182)
at star.base.neo.NeoProperty.readNextChar(NeoProperty.java:804)
at star.base.neo.NeoProperty.input(NeoProperty.java:583)
[catch] at star.base.neo.ClientNotifyHandler.run(ClientNotifyHandler.java:427)
at java.lang.Thread.run(Thread.java:748)
WARNING [org.netbeans.core.multiview.MultiViewTopComponent]: The MultiviewDescription instance class star.coremodule.ui.SimulationMultiviewDesc is not serializable. Cannot persist TopComponent.
WARNING [null]: Last record repeated again.
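
The stack trace just shows the client losing its TCP connection to the server process, so it is the symptom rather than the cause. On a Windows cluster it can be worth ruling out firewall or name-resolution problems on the new node; a rough reachability check from the client machine is sketched below (the hostnames and the port are placeholders, not values from this setup, and it only verifies that a TCP connection can be opened at all):

Code:
# Rough TCP reachability check from the client machine to each node.
# Hostnames and port are placeholders -- substitute your own node names and
# a port you know a service is listening on (e.g. 445/SMB is usually open
# on Windows hosts).
import socket

NODES = ["node1", "node2", "node3", "node4"]   # hypothetical hostnames
PORT = 445                                     # placeholder port

for host in NODES:
    try:
        with socket.create_connection((host, PORT), timeout=5):
            print(f"{host}:{PORT} reachable")
    except OSError as exc:
        print(f"{host}:{PORT} NOT reachable ({exc})")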

Old   April 2, 2020, 05:59
Default
  #5
New Member
 
Join Date: Sep 2012
Posts: 26
Rep Power: 14
tailele is on a distinguished road
Now I have added the second CPU, so my situation is 4 nodes with two CPUs per node:
Node 1: 2x E5-2658 v3 (24 cores)
Node 2: 2x E5-2658 v3 (24 cores)
Node 3: 2x E5-2658 v3 (24 cores)
Node 4: 2x E5-2683 v4 (32 cores)

Today I started a new session with two tests:

-1: 4 nodes with 24 cores per node --- passed
-2: 3 nodes with 24 cores per node and node 4 with 32 cores --- failed

Is it possible that all nodes must have the same number of cores?
But then how, with only 3 nodes, was I able to use 2 nodes with 24 cores and one with 16 cores?
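
For what it's worth, uneven splits like the earlier 24/24/16 run are normally expressed through a per-host machine file rather than a single cores-per-node setting, so mixed core counts are not forbidden as such. A small sketch that writes such a file is below; the hostnames and the host:n line format are my assumptions, so the exact machine-file syntax should be checked against the STAR-CCM+ user guide for your version:

Code:
# Write a machine file giving each host its own process count, so nodes
# with different core counts can be used together. Hostnames are
# hypothetical and the "host:n" syntax is an assumption -- verify the
# exact machine-file format in the STAR-CCM+ documentation.
hosts = {"node1": 24, "node2": 24, "node3": 24, "node4": 32}

with open("machinefile.txt", "w") as f:
    for host, ncores in hosts.items():
        f.write(f"{host}:{ncores}\n")

print(f"machinefile.txt written for {sum(hosts.values())} processes")

The resulting file would then be passed to the solver together with the matching total process count (the -machinefile and -np options in the versions I have used).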
