Setup Ansys RSM with ARC on Linux cluster

#1 | July 27, 2022, 10:01 | Maxim (-Maxim-) | Senior Member
Join Date: Aug 2015 | Location: Germany | Posts: 413
Hello,

I hope this is the right section of the forum; if not, please move my post. Thanks.

I am trying to set up Ansys RSM with ARC (Ansys RSM Cluster) on my small Linux cluster running CentOS and Ansys 2022 R2. The cluster consists of a master with 2 compute nodes (16 cores each), connected via InfiniBand.

So far I have copied my .def files over, started the calculation via PuTTY from my Windows workstation, and copied the results back. Now I would like to use the optimization features of Workbench, and for that I need to set up RSM to send jobs to the cluster from Workbench.

What I've done so far is the following (all as root on master):
- ./rsmclusterconfig to create service scripts rsmlauncher, arcmaster and arcnode
- ./rsmconfig -launcher --> successfully installed and launched --> RSM Launcher Service is running
- ./arcmaster start --> successful
- ./arcnode start --> successful

Then I launched the RSM configuration tool on my Windows workstation, defined the cluster name and the staging directories (on the network share and on the cluster), stored the user credentials, and imported and refreshed the queues.

I have two ARC queues now: Default and local, which would start the test job on the master. When I test either of them, the test now runs successfully. (*edited* - no idea why it works now...)

However, I am not able to set up my compute nodes in ./arcconfigui. When I try to add node01 (that's the name of my node), it prints the following in PuTTY:
Code:
/usr/ansys_inc/v222/RSM/ARC/tools/linx64/arcdeploy -f "/root/.ansys/v222/deploy.csv" -nomaster
You entered the following 3 command line arguments:
-f
/root/.ansys/v222/deploy.csv
-nomaster
Starting ARC service on node01
I am running the following commands
ssh -tt node01 "sudo -E /usr/ansys_inc/v222/RSM/ARC/tools/linx64/install_daemon -arcnode"
Installing Ansys RSM Cluster Node service
arcnode222 installed.
Please use /etc/init.d/arcnode222 {start|stop|restart|status|try-restart|condrestart} to monitor arcnode222 service
Setting environment variables from '/usr/ansys_inc/v222/RSM/ARC/tools/linx64/arc_env_profile'
Ansys RSM Cluster Node Service is not running
Setting environment variables from '/usr/ansys_inc/v222/RSM/ARC/tools/linx64/arc_env_profile'
Starting Ansys RSM Cluster Node Service ...
Starting /usr/ansys_inc/v222/Framework/bin/Linux64/runwb2 -cmd mono /usr/ansys_inc/v222/RSM/ARC/bin/ArcNodeService.exe -noshutdown...
Ansys RSM Cluster Node Service start failed.  Please check the Logfile for Errors
mkdir: cannot create directory ‘/home/rsmadmin’: Permission denied
 This process is not being run as an Administrator.
An Unhandled Exception has occurred. Access to the path '/home/rsmadmin/.ansys/v222/ARC' is denied.
I am running that as root, therefore I should be Administrator. The user "rsmadmin" seems to be auto-created (/home/rsmadmin exists). There is no firewall active.


Any pointers would be helpful. Thank you for your help!
-Maxim

Last edited by -Maxim-; July 27, 2022 at 11:01.

#2 | July 28, 2022, 11:25 | Erik (evcelica) | Senior Member
Join Date: Feb 2011 | Location: Earth (Land portion) | Posts: 1,186
Have you contacted ANSYS support? They have been very helpful in the past for issues like this.

#3 | July 29, 2022, 04:36 | Maxim (-Maxim-) | Senior Member
Join Date: Aug 2015 | Location: Germany | Posts: 413
Thank you evcelica for your answer.

I contacted my local Ansys guy about setting up RSM, and he said it would probably take a day (incl. setting up PBS queueing and potentially updating the OS) and would be billed separately, outside our normal support contract. That's why I wanted to see if I can set it up myself with ARC and test whether that's enough for my purposes.
I will probably just contact him again with that specific question.

#4 | September 1, 2022, 04:13 | Maxim (-Maxim-) | Senior Member
Join Date: Aug 2015 | Location: Germany | Posts: 413
Quick update:
With the help of Ansys support, we figured out that ARC had created the rsmadmin user separately on the master and on the nodes, so those accounts had different user IDs. As a result, the rsmadmin user from the master didn't have the rights to access the home directory of rsmadmin on the node.
We fixed that by creating the rsmadmin user manually with a fixed user ID on both the master and the nodes.
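A UID mismatch of this kind can be spotted by comparing the numeric ID of the account on every machine. The sketch below is illustrative only: it assumes passwordless ssh from the master to the nodes, and the helper names uid_on_host and same_uid_everywhere, as well as the UID 1500 in the comment, are placeholders rather than anything Ansys support prescribed.

```shell
#!/bin/sh
# Sketch only: compare the numeric UID of one account across hosts.
# Host names and passwordless ssh are assumptions, not taken from the thread.

# Print the UID of user $2 on host $1 ("localhost" is checked without ssh).
uid_on_host() {
    if [ "$1" = "localhost" ]; then
        id -u "$2"
    else
        ssh "$1" id -u "$2"
    fi
}

# Return 0 if every host reports the same UID for the user, 1 otherwise.
# Usage: same_uid_everywhere USER HOST...
same_uid_everywhere() {
    user=$1; shift
    first=""
    for host in "$@"; do
        uid=$(uid_on_host "$host" "$user") || return 1
        if [ -z "$first" ]; then
            first=$uid
        elif [ "$uid" != "$first" ]; then
            echo "UID mismatch for $user on $host: $uid (expected $first)" >&2
            return 1
        fi
    done
    return 0
}

# If the UIDs differ, the account can be recreated with an explicit UID on
# each machine, e.g. (1500 is an arbitrary placeholder, run as root):
#   useradd --uid 1500 --create-home rsmadmin
```

On the master, `same_uid_everywhere rsmadmin localhost node01` would then report whether node01 disagrees.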


BUT: I still can't deploy the ARC node service from the arcconfig to the cluster:
Code:
  Database Directory is: /home/rsmadmin/.ansys/v222/ARC
  The UpdateMasterLoad method failed to update execution node load table to master: node01.cluster:13222.
  Please check that the master service is started and that there is no firewall blocking access on ports 11222, 12222, or $
There is no firewall active on the nodes.
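Even without a firewall, it can help to confirm from the compute node that the master's ports actually accept connections. The following is a rough sketch, assuming bash (for its /dev/tcp redirection) and coreutils' timeout are installed; port_open and check_ports are hypothetical helper names. The truncated port list in the message above is not completed here, so only port numbers that are legible in the log should be passed in.

```shell
#!/bin/sh
# Sketch only: test whether a TCP port accepts connections.
# Relies on bash's /dev/tcp feature and coreutils' timeout.

# Return 0 if HOST:PORT accepts a TCP connection within 3 seconds.
# Usage: port_open HOST PORT
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Print one line per port for the given host.
# Usage: check_ports HOST PORT...
check_ports() {
    host=$1; shift
    for port in "$@"; do
        if port_open "$host" "$port"; then
            echo "$host:$port open"
        else
            echo "$host:$port closed or unreachable"
        fi
    done
}
```

For example, `check_ports node01.cluster 11222 12222 13222` uses the master address and the port numbers that appear in the error message. A "closed or unreachable" line only says that nothing answered on that port; it does not say why.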



Ansys support is out of ideas. Their suggestion now is to reinstall the whole server with a newer CentOS, newer InfiniBand drivers, etc.
Maybe you guys have another idea first? Thanks.

Last edited by -Maxim-; September 1, 2022 at 06:45.

#6 | March 14, 2023, 05:03 | Amin (ams_sharif) | New Member
Join Date: Mar 2023 | Posts: 2
Hi,
I know I am 6 months late, but I have just stumbled into the same error.
The fix for me was quite easy, though:


/ansys_inc/v*/RSM/ARC/tools/linx64/install_daemon -arcmaster

#7 | March 22, 2023, 09:17 | Maxim (-Maxim-) | Senior Member
Join Date: Aug 2015 | Location: Germany | Posts: 413
Quote:
Originally Posted by ams_sharif View Post
Hi,
I know I am 6 months late, but I have just stumbled into the same error
The fix for me was quite easy though

/ansys_inc/v*/RSM/ARC/tools/linx64/install_daemon -arcmaster

Hi ams_sharif,
thanks for taking the time to sign up and respond to my post. However, I am not sure that will fix my problem too. The arcmaster daemon is already installed on my HPC head/master node, and sending jobs to the master works well. The problem is on the compute nodes, where I should only install the arcnode daemon.


Do you also have an HPC cluster setup where you can send jobs to each individual node, or do you only send jobs to the head/master?

#8 | March 29, 2023, 09:14 | Amin (ams_sharif) | New Member
Join Date: Mar 2023 | Posts: 2
Oh, I see what you are trying to do. I'm not sure if this is possible. As far as I'm aware, every node should have one master only.
I submit all my jobs through Slurm to the master node and it takes care of the process. The PCs I use are a bit old, so I allocate all of the processors to a single job and the remaining jobs are queued.

Have you tried running arcconfigui on a 2nd node? Set it up as a master, add the nodes, and see if both configs keep the nodes under them. If yes, then you are good to go. You would just need a job handler/scheduler to distribute your submitted jobs between both configs.
I'm currently away from the cluster. I might try it out next week and report back with my findings.