Confused about SLURM: I SSH to a compute node with a private key, so how can SLURM access a compute node if I just add its name to slurm.conf?
Question
I'd like to understand how slurmctld accesses the compute nodes or sends information to them. I'm still setting up SLURM, so it is not fully functional yet.
Currently I SSH to the compute node with a private key, i.e., ssh -i mykey.pem compute-node01. To set up SLURM, I just added the compute node name to slurm.conf (using the output of slurmd -C). Then I copied munge.key and slurm.conf to all nodes so they are the same.
Currently, it is not working: I get "munge credential not recognized". I wonder if it is because every time I access a node I must type ssh -i mykey.pem compute-node0X, i.e., I must use a private key to access each node...
I have the following questions:
- How does SLURM get access to the other nodes? I never registered any IP anywhere; I just added the node name from slurmd -C to slurm.conf, which as far as I can tell doesn't say anything relevant for making a real connection. Is it because they share the munge.key, and within this key there is some sort of ssh -i privatekey IP connection?
- Is my SSH access with a key blocking SLURM, and is that why I get "credential not recognized"?
Thanks
Answer 1
Score: 0
Slurm does not use SSH to communicate. Once the munge daemon and the slurmd daemon are up and running, the slurmd daemons communicate with the slurmctld daemon through Slurm-specific ports, provided that:
- they share the same munge key;
- the server running slurmd is registered in the slurm.conf of the server running slurmctld;
- the firewall does not block communication on the Slurm ports (by default 6817 for slurmctld and 6818 for slurmd).
The mykey.pem SSH key is related to your user account, not to Slurm.
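On the "I never registered any IP" point: as far as I know, the daemons find each other by resolving the node name written in slurm.conf through ordinary name resolution (DNS or /etc/hosts, or an explicit NodeAddr if one is set) and then opening plain TCP connections. The sketch below is a rough way to check those two conditions from the controller. It assumes the default ports (6817 and 6818) have not been changed in slurm.conf, and the helper names `resolve`, `port_open` and `describe` are made up for this illustration; they are not part of Slurm.

```python
import socket

# Default Slurm ports (assumption: SlurmctldPort / SlurmdPort are not
# overridden in your slurm.conf).
SLURMCTLD_PORT = 6817
SLURMD_PORT = 6818


def resolve(hostname):
    """Return the IPv4 address the name resolves to, or None if lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except socket.gaierror:
        return None


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def describe(node, port=SLURMD_PORT, timeout=3.0):
    """Summarise whether a node name resolves and whether the port accepts TCP."""
    addr = resolve(node)
    if addr is None:
        return f"{node}: name does not resolve (check DNS or /etc/hosts)"
    if port_open(addr, port, timeout):
        return f"{node} ({addr}) port {port}: reachable"
    return f"{node} ({addr}) port {port}: blocked or daemon not running"
```

For example, `print(describe("compute-node01"))` run on the controller tests the slurmd port of the node from the question, and `describe("<controller-name>", SLURMCTLD_PORT)` run on a compute node tests the other direction. A reachable port only rules out name-resolution and firewall problems; it says nothing about whether the munge keys match.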