Fluent Jobs failed to start on Linux Cluster with PBS |
December 19, 2019, 13:03 |
|
#1 |
New Member
S. Schneider
Join Date: Sep 2019
Posts: 2
Rep Power: 0 |
Hey there,
I'm currently writing my master's thesis using Fluent and I've come across the following problem: I would like to run my Fluent simulation from a Windows client via RSM on a remote Linux cluster. The connection between client and cluster is established via a VPN tunnel. Client and cluster use two different license servers: the client has access to a local license server even without a VPN connection, and the same applies to the cluster. On the cluster, PBS Pro is used as the job management system.

The setup of RSM on the head node (Athena) and the client was successful (test successful, data transfer successful, ...). When I start a simulation from Workbench (Solution Settings: Submit to Remote Solve Manager, RSM Queue selected, ...), the job is copied to the staging directory and added to the PBS queue (job ID assigned, status Running). But when I connect to the assigned execution nodes and look at the CPU load and processes, no Fluent process has been started...

Unfortunately, I could not find anything useful while checking the log data. Only these entries with the job ID were found in the log directory of PBS:

Path: /var/spool/pbs/sched_logs/20191219

In the staging directory I also did not find any information in the files.

Do any of you have any idea why the Fluent process is not running? Where can I find more log data that could help me?

Greetings
Steffen |
|
December 19, 2019, 18:02 |
|
#2 |
Senior Member
Svetlana Tkachenko
Join Date: Oct 2013
Location: Australia, Sydney
Posts: 416
Rep Power: 15 |
I haven't done such an installation before, but here are a few thoughts about what to check:
- Is RSM running on the execution node?
- If it is, does restarting it help with this issue?
- If it is running and restarting it does not help: can you telnet to the execution node on port 9192? Does the connection succeed? If not, what error does it output?
- Does it work if you try to run ANSYS CFX instead of Fluent, in case you have it installed? (You do not have to create a CFX case for this. Just get it to start with a non-existing file name and it should complain that the file does not exist.) |
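If telnet is not installed on the client or head node, the port check above can be approximated with a few lines of Python. This is only a minimal sketch: the function name `check_port` is made up for illustration, port 9192 is simply the one mentioned in the list above, and the host has to be replaced with whichever execution node PBS assigned to the job.

```python
import socket
from typing import Tuple


def check_port(host: str, port: int = 9192, timeout: float = 5.0) -> Tuple[bool, str]:
    """Attempt a plain TCP connection, similar to `telnet host port`.

    Returns (True, message) if the connection is accepted, otherwise
    (False, message) with the error that was raised.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connection succeeded"
    except OSError as exc:
        # OSError covers refused connections, timeouts and DNS failures.
        return False, f"{type(exc).__name__}: {exc}"
```

Calling `check_port("node01")` (with `node01` standing in for the real node name) returns a pair such as `(False, "ConnectionRefusedError: ...")`, and the error text is the part worth posting back here. A refused connection would suggest that nothing is listening on that port, while a timeout points more towards a firewall or routing issue.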
|
December 21, 2019, 09:26 |
|
#3 |
New Member
S. Schneider
Join Date: Sep 2019
Posts: 2
Rep Power: 0 |
Thanks for your response,
At first I thought that of course RSM runs on the execution nodes, but then I checked the service again. It seems the RSM service only runs on the head node, not on the execution nodes... I really think that this might be the problem. Unfortunately, the network administrator is on vacation until the beginning of next year and I cannot start the service myself... But as soon as I have news, I will let you know! |
|
January 24, 2020, 00:17 |
|
#4 |
Senior Member
Svetlana Tkachenko
Join Date: Oct 2013
Location: Australia, Sydney
Posts: 416
Rep Power: 15 |
Any news here?
|
|
Tags |
linux cluster, pbs, rsm |