Confused about SLURM: I SSH to a compute node with a private key, so how can SLURM access a compute node if I just add its name to slurm.conf?
Question
I'd like to understand how slurmctld accesses the compute nodes or sends information to them. I'm still setting up SLURM, so it is not fully functional yet.
Currently I SSH to the compute node with a private key, i.e., ssh -i mykey.pem compute-node01. To set up SLURM, I just added the compute node name to slurm.conf (via slurmd -C). Then I copied munge.key and slurm.conf to all nodes so they are identical.
Currently, it is not working: I get munge credential not recognized. I wonder if it is because every time I access a node I must type ssh -i mykey.pem compute-node0X, i.e., I must use a private key to access each node...
I have the following questions:
- How does SLURM get access to the other nodes? I never registered any IP anywhere; I just added the node name to slurm.conf using slurmd -C, which, as far as I can tell, says nothing that would establish a real connection. Is it because they share the munge.key, and within this key there is some sort of ssh -i privatekey IP connection?
- Is my SSH access with a key blocking SLURM, and is that why I get credential not recognized?
Thanks
Answer 1
Score: 0
Slurm does not use SSH to communicate. Once the munge daemon and the slurmd daemon are up and running, the slurmd daemons communicate with the slurmctld daemon through Slurm-specific ports, provided that:
- they share the same munge key;
- the server running slurmd is registered in the slurm.conf of the server running slurmctld;
- the firewall does not block communications on the Slurm ports (by default 6817 for slurmctld and 6818 for slurmd).
The mykey.pem SSH key is related to your user account, not to Slurm.
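Two of the conditions listed above can be checked by hand. The following is only a minimal sketch, not part of Slurm or MUNGE: the helper names key_digest and port_reachable are made up for illustration, and the path /etc/munge/munge.key is an assumed default location that may differ on your system. The first helper prints a digest of the key file so it can be compared across nodes; the second tests whether a TCP port is reachable from the current machine.

```python
import hashlib
import socket


def key_digest(path):
    """Return the SHA-256 hex digest of a file, e.g. a node's munge.key.

    Running this on every node and comparing the results shows whether
    the nodes really hold identical keys.
    """
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Name resolution failures, refused connections, and timeouts all
    count as "not reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (hostname taken from the question; path and port are
# assumptions, adjust them to your cluster):
#   print(key_digest("/etc/munge/munge.key"))      # run on each node
#   print(port_reachable("compute-node01", 6818))  # run on the controller
```

If the digests differ between nodes, or the port check fails while slurmd is running, that points to the key copy or the firewall rather than to SSH. Reading the key file usually requires root or the munge user.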