Mellanox ConnectX-3 Infiniband problem with Platform MPI 9.1.3
August 22, 2016, 03:16 |
Mellanox ConnectX-3 Infiniband problem with Platform MPI 9.1.3
|
#1 |
New Member
M-G
Join Date: Apr 2016
Posts: 28
Rep Power: 10 |
Dears,
I have recently purchased two used Mellanox MCX353A-FCBT InfiniBand VPI adapters. On Windows 7 x64 / Windows Server 2012 R2 / Windows 10 x64, I installed the latest driver and firmware from the Mellanox website. The two cards are connected directly without a switch and, of course, after launching the opensm program they work fine: the nodes can ping each other with ibping. The problem starts when I want to use them with Platform MPI 9.1.3 for ANSYS Fluent 17.1, regardless of the Windows version. Unfortunately, MPI does not detect the card when it is forced with the -IBAL switch; only TCP mode works fine. The generated error is (without -IBAL everything works fine, but over TCP only):

C:\Users\Administrator>"%MPI_ROOT%\bin\mpirun" -mpi64 -IBAL -prot -netaddr 192.168.5.1 -hostlist ews15,ews17 c:\MPI\pp.exe
mpirun: Drive is not a network mapped - using local drive.
c:\MPI\pp.exe: Rank 0:1: MPI_Init: didn't find active interface/port
c:\MPI\pp.exe: Rank 0:1: MPI_Init: Can't initialize RDMA device
c:\MPI\pp.exe: Rank 0:1: MPI_Init: Internal Error: Cannot initialize RDMA protocol
MPI Application rank 0 exited before MPI_Init() with status 1
c:\MPI\pp.exe: Rank 0:0: MPI_Init: didn't find active interface/port

I have checked that ibal.dll is located in Windows\system32. Any ideas? Thanks
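For anyone hitting the same "Can't initialize RDMA device" error, it may be worth confirming that the InfiniBand port itself is ACTIVE before involving MPI at all. The following is only a sketch using common diagnostic tools shipped with Mellanox WinOF / OFED for Windows; tool availability and output format vary between driver releases, so treat the exact names as assumptions:

```shell
REM Sketch only: WinOF/OFED diagnostic tools; run from an elevated command prompt.
REM 1. Check the HCA and port state; the port must be ACTIVE, not just INIT:
ibstat

REM 2. A back-to-back link has no switch, so a subnet manager must run on
REM    exactly one of the two nodes to bring the ports from INIT to ACTIVE:
opensm

REM 3. Verify raw IB connectivity: start the ibping server on one node ...
ibping -S
REM    ... and on the other node, ping it by the LID reported by ibstat:
ibping <server-port-LID>
```

Only once all of this works (as it does in the setup above) does a failure like the one quoted point at the MPI/driver interface rather than the fabric itself.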
|
August 22, 2016, 11:05 |
|
#2 |
New Member
M-G
Join Date: Apr 2016
Posts: 28
Rep Power: 10 |
Dears,
As ghost82 said in another thread, the WinOF 2.1.2 driver (which is no longer available) works fine and solved the problem. But this applies to Windows 7 only; it is not applicable to Windows 10 or Server 2012 R2. Furthermore, the release date of that driver is 2010/9. Why does this happen? Shouldn't IBM have updated its MPI, and it didn't? So what are their customers supposed to do? I'm confused. Any ideas?
|
August 22, 2016, 12:56 |
|
#3 |
Senior Member
Erik
Join Date: Feb 2011
Location: Earth (Land portion)
Posts: 1,188
Rep Power: 23 |
I couldn't get my ConnectX cards working yet either, so I'm still using my InfiniHost III cards; I may not be too much help, but here are a few things to look at:
Did you cache your password for Platform MPI? Run the command "%AWP_ROOT171%\commonfiles\MPI\Platform\9.1.3.1\Windows\setpcmpipassword.bat". Have you configured the firewall correctly to let the proper programs through on your public network?
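Erik's two suggestions can be written out as commands. This is a sketch: the firewall rule names and the solver path are illustrative placeholders (the actual solver executable depends on the ANSYS install), and `netsh` must be run from an elevated prompt:

```shell
REM Cache the Windows credentials Platform MPI needs to start remote ranks
REM (batch file path as shipped with ANSYS 17.1):
"%AWP_ROOT171%\commonfiles\MPI\Platform\9.1.3.1\Windows\setpcmpipassword.bat"

REM Let the MPI launcher and the solver binary through the firewall.
REM "your_solver.exe" is a placeholder, not a real ANSYS file name:
netsh advfirewall firewall add rule name="Platform MPI mpirun" dir=in action=allow program="%MPI_ROOT%\bin\mpirun.exe" enable=yes
netsh advfirewall firewall add rule name="Fluent solver" dir=in action=allow program="C:\path\to\your_solver.exe" enable=yes
```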
|
August 22, 2016, 13:15 |
|
#4 |
New Member
M-G
Join Date: Apr 2016
Posts: 28
Rep Power: 10 |
Dear Erik,
There is no problem with the parameters. The problem is compatibility between Platform MPI and the Mellanox driver. With the WinOF 2.1.2 driver everything is OK on Windows 7, and ANSYS Fluent speeds up to 210% with two nodes. But my question is: since 2010, Mellanox has released more than 10 drivers. Why do they no longer work with Platform MPI?
|
August 26, 2016, 16:36 |
|
#5 |
New Member
M-G
Join Date: Apr 2016
Posts: 28
Rep Power: 10 |
Based on the link below, IBM does not support WinOF later than 2.1:
https://www.ibm.com/support/knowledg...platforms.html
I don't know the reason behind this. Based on the next link, Intel MPI supports Mellanox WinOF 4.4 and higher. Therefore there should be no problem for those who run ANSYS Fluent on Windows with Intel MPI and the newer Mellanox adapters:
https://software.intel.com/sites/def...es-windows.pdf
I have not tested it yet; I hope it works. I don't understand why IBM does not upgrade Platform MPI to support the new Mellanox adapters on Windows.
|
November 30, 2017, 18:21 |
|
#6
Member
Join Date: Jul 2011
Posts: 53
Rep Power: 15 |
It looks like Platform MPI still only supports WinOF 2.1. Did you get the Mellanox MCX353A cards working using Intel MPI? |
December 2, 2017, 11:50 |
|
#7
New Member
M-G
Join Date: Apr 2016
Posts: 28
Rep Power: 10 |
Intel MPI, as it comes with ANSYS 18.2.2, does not work with Mellanox WinOF 5.35 on Windows 10. I have also installed the latest Intel MPI (2018) on Windows, but it runs as a service and ANSYS still uses its old bundled 5.1.3.180 version for execution. The ANSYS manual states that InfiniBand is only supported with MS-MPI on the Windows platform, but there is no MS-MPI folder in "C:\Program Files\ANSYS Inc\v182\fluent\fluent18.2.0\multiport\mpi\win64", which leads to an error when MS-MPI is selected. (Of course I have installed the latest MS-MPI from Microsoft as a service, but the ANSYS executable files do not exist.) The only successful way is to use Windows 7 with the WinOF 2.1 driver, or to migrate to Linux.
December 2, 2017, 12:54 |
|
#8
Member
Join Date: Jul 2011
Posts: 53
Rep Power: 15 |
According to the release notes of Intel MPI 5.1, Mellanox WinOF Rev 4.4 or higher is supported. Interestingly, the release notes mention compatibility with Windows Server, Windows 7, Windows 8, and Windows 8.1. It does not include Windows 10, but I wouldn't imagine it should be a problem...? https://jp.xlsoft.com/documents/inte...es-windows.pdf |
December 2, 2017, 13:13 |
|
#9
New Member
M-G
Join Date: Apr 2016
Posts: 28
Rep Power: 10 |
[0] MPI startup(): dapl fabric is not available and fallback fabric is not enabled

I have two ideas:
1- While the latest Intel MPI (2018 Update 1) is installed on both systems, a hello-world application should be tested from the command line with the DAPL fabric forced. My guess is that ANSYS is compiled against the old Intel MPI; if this test works fine, we should wait for an upcoming ANSYS release to solve this. Note that even when the latest Intel MPI is installed, using the setimpipassword.bat file to cache the Windows credentials changes the MPI server location to the ANSYS embedded files, which are version 5.1.3.180. Be careful about this.
2- As I asked Mellanox before regarding the supported MPI platforms (https://community.mellanox.com/thread/3402), they stated that only MS-MPI is supported. Idea 1 should be tested with MS-MPI, and if it works, the MS-MPI files should be manually copied to the location below and ANSYS launched again. It might work:
C:\Program Files\ANSYS Inc\v182\fluent\fluent18.2.0\multiport\mpi\win64\ms\Bin
If you have any ideas, kindly let me know.
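Idea 1 above, forcing the DAPL fabric from the command line independently of ANSYS, could be sketched roughly as follows. The hostnames are this thread's examples and hello.exe is a placeholder test binary; disabling the fallback makes a DAPL failure fail loudly instead of silently dropping to TCP:

```shell
REM Sketch: standalone Intel MPI run with the DAPL fabric forced and the
REM TCP fallback disabled, so the "dapl fabric is not available" error
REM reproduces unambiguously outside of ANSYS.
mpiexec -n 2 -ppn 1 -hosts ews15,ews17 ^
    -genv I_MPI_FABRICS dapl ^
    -genv I_MPI_FALLBACK 0 ^
    hello.exe
```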
December 3, 2017, 11:48 |
|
#10 |
Member
Join Date: Jul 2011
Posts: 53
Rep Power: 15 |
http://www.ansys.com/-/media/ansys/c...-182.pdf?la=en
According to that document, CFX/Fluent under Windows 10 supports Infiniband with Intel MPI 5.1.3. In the same document MS MPI is listed as only supported with Windows Server 2012. So do we know for a fact that Mellanox WinOF 5.35 (latest version) does not support Intel MPI 5.1.3? Have you contacted Ansys support? |
|
December 4, 2017, 05:53 |
|
#11 |
Member
Join Date: Jul 2011
Posts: 53
Rep Power: 15 |
Have you tried to use either IBM Platform MPI 9.1.4 or Intel MPI 5.1.3 together with the OFED software from OpenFabrics.org (instead of using the Mellanox WinOF software)? The latest version is OFED 3.2
https://www.openfabrics.org/index.php/ofs-windows.html |
|
December 8, 2017, 04:10 |
|
#12
New Member
M-G
Join Date: Apr 2016
Posts: 28
Rep Power: 10 |
On Windows 10, I installed every version from WinOF 4.95 through WinOF 5.35 on both nodes. A diagnostic tool from Intel (IMB-RMA.exe) was used with Intel MPI 2018.1 for each driver set; none of them could work with the -DAPL switch. This means Intel MPI does not support WinOF 4.95 and newer versions on Windows 10. MS-MPI 8.1.1 was successfully tested with WinOF 5.35 in direct connection mode. I could not integrate it with ANSYS Fluent yet, but MS-MPI seems to be the only supported MPI for Windows 10 with the latest Mellanox drivers.
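For anyone repeating the MS-MPI test, a rough sketch of the run is below. The hostnames are this thread's examples, and MSMPI_DISABLE_SOCK is assumed here to prevent a silent fallback to the socket interconnect, so the run only succeeds over the RDMA (Network Direct) path:

```shell
REM Sketch: MS-MPI run between the two nodes. MS-MPI prefers Network Direct
REM automatically when the driver's ND provider is installed; disabling the
REM socket interconnect makes an ND failure visible instead of silent.
mpiexec -hosts 2 ews15 1 ews17 1 -env MSMPI_DISABLE_SOCK 1 IMB-MPI1.exe PingPong
```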
January 22, 2019, 17:31 |
|
#13
New Member
Allen
Join Date: Dec 2018
Posts: 4
Rep Power: 7 |
Should I launch the opensm program on both PCs, or will just one be fine? Thank you!
January 23, 2019, 15:28 |
|
#14 |
New Member
M-G
Join Date: Apr 2016
Posts: 28
Rep Power: 10 |
Just one is enough.