January 6, 2019, 08:31 |
fluent issue in windows hpc pack 2012 r2
|
#1 |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
Hello,
I have created an HPC cluster using Microsoft HPC Pack 2012 R2 and I'm trying to run Fluent. I followed the post-installation tasks described at this link: http://www.sharcnet.ca/Software/Ansy..._setupdan.html
The cluster has 2 nodes (it is a test cluster), and ANSYS 18.2 is installed on both. When I try to run Fluent in parallel mode I get a warning (pics added, along with pics of my Fluent launcher config). The issue is that the job fails after a little while and the Fluent GUI gets stuck. Any ideas what I'm doing wrong?
Update: if I run on one node only and choose the head node, it works, so the issue has to be my compute node. I also noticed this warning: FLUENT_INC=C:/PROGRA~1/ANSYSI~1/v182/fluent is not a shared directory
Update 2: running only on the compute node also succeeds. When I run on both machines the job does not fail, but the GUI is stuck.
Fluent console output:
Host spawning Node 0 on machine "WIN-6CQO3PDKEA1" (win64).
***
*** FLUENT_INC=C:/PROGRA~1/ANSYSI~1/v182/fluent is not a shared directory!
Fluent may not work properly in parallel across a network!
***
Job has been submitted. ID: 3013.
Waiting for CCP scheduler@WIN-6CQO3PDKEA1 to start msmpi nodes ...
Job 3013 is Running.
Last edited by mosesHPC; January 6, 2019 at 10:58. |
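The warning in the console output above concerns FLUENT_INC pointing at a local drive path. Here is a minimal Python sketch for telling a drive-letter path apart from a UNC network path. It assumes that "local drive path vs. UNC path" is the distinction that matters; how Fluent itself decides is not stated in this thread. The host name `headnode` and share name `fluent_share` are hypothetical.

```python
import ntpath  # Windows path rules, usable on any OS


def is_unc_path(path):
    """Return True if `path` looks like a UNC path (two leading separators, host, share)."""
    drive, _rest = ntpath.splitdrive(path)
    # splitdrive returns the host-and-share prefix as the "drive" for UNC paths.
    # Forward slashes are accepted by Windows too, so normalise before checking.
    return drive.replace("/", "\\").startswith("\\\\")


# The path from the warning is a plain drive-letter path.
print(is_unc_path("C:/PROGRA~1/ANSYSI~1/v182/fluent"))
# Hypothetical shared location (host and share names are made up).
print(is_unc_path(r"\\headnode\fluent_share\v182\fluent"))
```

If FLUENT_INC comes back as a drive-letter path, a network path to the same installation may be worth trying.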
|
January 6, 2019, 16:29 |
|
#3 |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
Thanks for the reply.
I have already read that thread. My goal is to set this up using Microsoft HPC Pack, since we will reuse the cluster for other applications later on. Also, my cluster has AMD processors and that thread uses Intel MPI. |
|
January 7, 2019, 08:39 |
|
#4 | |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
When installing Intel MPI I got a warning that my CPUs don't have Intel architecture (obviously), but it installed anyway. When I run ANSYS in parallel mode this error occurs: "cannot connect from [WINDOWS-HOSTNAME] to IP ADDRESS". The error appears to come from mpiexec.exe. I think the issue is my CPU. Any ideas? |
||
January 7, 2019, 12:21 |
|
#5 |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,751
Rep Power: 66 |
You don't have to use the intel mpi if you don't want to or can't. But the important point is that your workers need to be able to connect to the host machine and vice-versa.
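A rough way to test the "workers can connect to the host and vice versa" point is to try opening a TCP connection in each direction. This Python sketch is only an illustration: the node names are hypothetical, and port 445 (SMB file sharing) is used as an example. The ports your MPI service actually listens on depend on the MPI in use and are not given in this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_nodes(nodes, port):
    """Map each node name to whether it accepted a connection on `port`."""
    return {node: can_connect(node, port) for node in nodes}


# Example call (hypothetical host names; run it on every machine in turn):
# print(check_nodes(["headnode", "computenode"], 445))
```

A successful ping does not prove that a given TCP port is reachable, so a check like this needs to be run from each machine towards all the others.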
|
|
January 7, 2019, 13:10 |
|
#6 | |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
My nodes can ping the host and have access to the shared directory, and I have disabled the firewalls on all systems. I did a bit more digging and it looks like msmpi.exe cannot access the service on the main node or on any other node. It looks like a certain service is not running or is refusing connections, but I couldn't find the service that the error message mentioned on any of the nodes. Is it possible to get guides on running Fluent on other clusters like Rocks or OpenHPC, or even on running it distributed on Linux? I appreciate the help. |
||
January 7, 2019, 15:37 |
|
#7 | |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,751
Rep Power: 66 |
I'm assuming you have already entered your Windows username and password into the dialog box at some point.
And you no longer get this warning? Otherwise that means they are not being found!
From the head node, can you launch Fluent running only on the compute node? From the compute node, can you launch Fluent running only on the head node? I'm guessing neither of these will work. Wildcard attempt at a fix: try selecting default for the MPI type instead of choosing msmpi. Last edited by LuckyTran; January 7, 2019 at 19:18. |
||
January 7, 2019, 18:45 |
|
#8 |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
I can launch Fluent on the head node alone.
I can also launch on the compute node alone (bring down the head node and launch with one node, which results in the job being assigned to the compute node), but when I launch with both it just hangs. It asks for a username and password after launch, which I entered. If I don't use the HPC cluster, how can I run distributed ANSYS on systems with AMD processors (Ryzen type)? |
|
January 7, 2019, 19:21 |
|
#9 |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,751
Rep Power: 66 |
The fact that you have AMD processors is not important. Intel MPI should also have run on your system(s), but I can see why you would want to run on msmpi to use the job scheduler. It's purely a question of the MPI installation: whether each machine has an open connection to the others, and whether they have sufficient privileges to access the directory containing the Fluent binaries.
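To check the privileges part, one option is to try listing the Fluent directory over the network from each machine, under the same account that runs the job. This is a minimal Python sketch; the UNC path in the comment is hypothetical and needs to be replaced with your own share.

```python
import os


def can_read_dir(path):
    """Return True if the current user can list the contents of `path`."""
    try:
        os.listdir(path)
        return True
    except OSError:
        # Raised for a missing path, an unreachable host, or denied permissions.
        return False


# Example call with a hypothetical share name:
# print(can_read_dir(r"\\headnode\fluent_share\v182\fluent"))
```

This only proves read access for whoever runs the script, so it is most useful when run as the account the scheduler uses.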
|
|
January 8, 2019, 01:48 |
|
#10 |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
I have added the console output from when I run Fluent on different nodes.
The output when I run on both nodes seems a bit odd, and when I choose both nodes it hangs. I don't know why I'm getting the error that the Fluent directory is not shared even though I have shared it (pic added). Any ideas what is causing it? Could it be a licensing issue? |
|
January 8, 2019, 11:03 |
|
#11 | |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,751
Rep Power: 66 |
You have a very common issue, and the solution is not always apparent.
So I see you have shared some directory.
Daniele also mentioned some interesting things that needed to be done to access the Fluent binaries. |
||
January 8, 2019, 11:29 |
|
#12 | |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
I tried this part. If I open Run and type \\IP_ADDRESS_OF_NODE, it opens the shared directory, and this works from both sides. However, if I open the Network section from My Computer, it does not show the other node (it works with the command but not in the GUI). Maybe that is the problem. How can I resolve this? As for the firewall, I have added a rule accepting all inbound connections on all nodes, and since I'm running Windows Server 2012 R2 I don't have Windows Defender to stop anything. |
||
January 8, 2019, 12:48 |
|
#13 |
Senior Member
Lucky
Join Date: Apr 2011
Location: Orlando, FL USA
Posts: 5,751
Rep Power: 66 |
The trick of going to Run and using \\address\C$ almost always works because it is a hidden administrative share. All you need for this to work is administrative privileges (i.e. be logged into an admin account with the correct username and password) and to have certain remote services enabled.
If you can do everything in Run using \\address\blah\blah\blah but cannot access them directly in the network explorer then... Unfortunately there's an entire list of services involved and any one of them can cause an error (ignore that it is a Win10 issue, because I've run into it all the way back to XP). My guess is, if you try to use these, they will also not work. |
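For reference, the \\address\C$ form maps a drive-letter path onto the hidden administrative share of a machine. This small Python sketch does that conversion. The helper name `to_admin_share` is made up for illustration; the host name and path are the ones from the console output earlier in the thread.

```python
import ntpath  # Windows path rules, usable on any OS


def to_admin_share(host, local_path):
    """Rewrite a drive-letter path as a path on `host`'s administrative share."""
    drive, rest = ntpath.splitdrive(local_path)
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError("expected a drive-letter path, got %r" % (local_path,))
    # "C:" becomes the hidden share "C$"; forward slashes become backslashes.
    return "\\\\" + host + "\\" + drive[0] + "$" + rest.replace("/", "\\")


print(to_admin_share("WIN-6CQO3PDKEA1", "C:/PROGRA~1/ANSYSI~1/v182/fluent"))
```

Typing the resulting path into Run is a quick way to see whether the administrative share is reachable with the current credentials.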
|
January 8, 2019, 13:21 |
|
#14 | |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
I did try to run it with these and you are right, they do not work. I will post a solution here if I find one. I'm going to reinstall my cluster tomorrow. Thank you for the help. |
||
January 9, 2019, 01:18 |
|
#15 |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
I enabled some services and the systems can now see each other on the network and access files, but the problem persists.
The services I enabled were these:
SSDP Discovery
Function Discovery Resource Publication
UPnP Device Host
Neither HPC Pack nor distributed mode works for me. Last edited by mosesHPC; January 9, 2019 at 02:26. |
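To confirm on each node that the three services listed above are actually running, their state can be read with the Windows `sc query` command. This Python sketch wraps that. The short service names in the dictionary are the usual ones for these display names, but treat them as an assumption and verify them in services.msc. The parser also assumes English-language `sc` output.

```python
import re
import subprocess

# Display name -> short service name (assumed; verify on your own system).
SERVICES = {
    "SSDP Discovery": "SSDPSRV",
    "Function Discovery Resource Publication": "FDResPub",
    "UPnP Device Host": "upnphost",
}


def parse_sc_state(output):
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", output)
    return match.group(1) if match else None


def query_service(short_name):
    """Run `sc query <name>` (Windows only) and return the parsed state, or None."""
    result = subprocess.run(
        ["sc", "query", short_name], capture_output=True, text=True
    )
    return parse_sc_state(result.stdout)


# Example, to be run on each Windows node:
# for display, short in SERVICES.items():
#     print(display, query_service(short))
```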
|
January 9, 2019, 05:51 |
|
#16 |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
I managed to get it working with distributed ANSYS and Intel MPI.
When I launch Fluent with both machines it works fine. I opened a project using all the cores, but it's giving me an error:
Ansys.Fluent.Cortex.CortexNotAvailableException: Exception of type 'Ansys.Fluent.Cortex.CortexNotAvailableException' was thrown.
at Ansys.Fluent.Data.SetupData.GetCommunicator(IReadLockContainer context)
at Ansys.Fluent.Data.SetupData.ReadCaseModelInfo(IFullContext context)
at Ansys.Fluent.Data.SetupData.ReadMeshAndModelInfo(IFullContext context)
at Ansys.Fluent.Data.SetupData.LoadFiles(IFullContext context)
at Ansys.Fluent.Commands.EditCommand.Execute(IFullContext context)
at Ansys.Core.Commands.Concurrency.CommandWorkUnit.executeInContext(CommandContext subContext, IExecutionEngineCallback tracer)
at Ansys.Core.Commands.Concurrency.BaseWorkUnit.doExecute(IExecutionEngineCallback executionEngine, CommandContext subContext)
at Ansys.Core.Commands.Concurrency.BaseWorkUnit.Execute(IExecutionEngineCallback executionEngine, Boolean dontCatchExceptions)
---
Ansys.Core.Commands.CommandFailedException: Exception of type 'Ansys.Fluent.Cortex.CortexNotAvailableException' was thrown.
CommandName: Fluent.Edit(Container="Setup")
at Ansys.Core.Commands.CommandAsyncResult.Wait(Int32 milliSecondsTimeout, Boolean exitContext)
at Ansys.Fluent.Commands.EditCommand.InvokeAndWait(IProtectedContext context, DataContainerReference Container, Boolean Interactive)
at Ansys.Fluent.Gui.OpenInFluentGui.Invoke(GuiOperationContext context)
at Ansys.UI.GuiOperationContext.Invoke(GuiOperationMetaData operationData)
at Ansys.UI.UIManager.InvokeOperationCore(String pseudoname, OperationDelegate callback, Boolean allowOSMessages, Boolean coreTransaction)
Is this an issue with the project itself or with my configuration? The project loads fine when I open it on a single machine. |
|
January 9, 2019, 07:11 |
|
#17 | |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
Looks like one of the systems is running low on RAM (it only has 8 GB). I can run the project with 15 cores but not 16. |
||
January 12, 2019, 16:01 |
|
#18 |
New Member
moses
Join Date: Jan 2019
Posts: 12
Rep Power: 7 |
OK, so I added some RAM and also some other hardware.
But I noticed that I can only use up to 8 processes on the extra machines; otherwise I run into this error. I will try to split those machines into multiple virtual machines to see whether or not I can utilize their full potential. |
|