Confused about SLURM: I SSH to a compute node with a private key, so how can SLURM access a compute node if I just add its name to slurm.conf?

Question

I'd like to understand how slurmctld accesses the compute nodes or sends information to them. I'm still setting up SLURM, so it is not fully functional yet.

Currently I SSH to the compute node with a private key, i.e., ssh -i mykey.pem compute-node01. To set up SLURM, I just added the compute node's name to slurm.conf (using the output of slurmd -C). Then I copied munge.key and slurm.conf to all nodes so that they are identical everywhere.
So far it is not working: I get "munge credential not recognized". I wonder whether that is because every time I access a node I must type ssh -i mykey.pem compute-node0X, i.e., I must use a private key to access each node...

I have the following questions:

  • How does SLURM get access to the other nodes? I never registered an IP address anywhere; I just added the node name from slurmd -C to slurm.conf, which, as far as I can tell, says nothing that would establish a real connection. Is it because the nodes share munge.key, and this key contains some sort of ssh -i privatekey IP connection?
  • Is my key-based SSH access blocking SLURM, and is that why I get "credential not recognized"?

Thanks

Answer 1

Score: 0

Slurm does not use SSH to communicate. Once the munge daemon and the slurmd daemon are up and running, the slurmd daemons communicate with the slurmctld daemon through Slurm-specific ports, provided that:

  • they share the same munge key;
  • the server running slurmd is registered in the slurm.conf of the server running slurmctld;
  • the firewall does not block communication on the Slurm ports (by default 6817 for slurmctld and 6818 for slurmd).

The mykey.pem SSH key is tied to your user account, not to Slurm.
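
As a rough way to check the name-resolution and firewall conditions above, here is a minimal Python sketch. It assumes the default Slurm ports (6817 for slurmctld, 6818 for slurmd); if your slurm.conf sets SlurmctldPort or SlurmdPort, use those values instead. The helper names (resolve_node, port_open, check_node) are made up for this illustration and are not part of Slurm. The sketch only shows that the node name resolves through the normal system resolver (DNS or /etc/hosts) and that a plain TCP connection can be opened. It says nothing about whether munge authentication succeeds.

```python
import socket

# Default ports from the Slurm documentation (assumption: not overridden
# by SlurmctldPort / SlurmdPort in your slurm.conf).
SLURMCTLD_PORT = 6817
SLURMD_PORT = 6818


def resolve_node(name):
    """Return the sorted set of IP addresses that `name` resolves to."""
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name not resolvable.
        return False


def check_node(name, port=SLURMD_PORT):
    """Summarise name resolution and port reachability for one node."""
    try:
        addresses = resolve_node(name)
    except socket.gaierror:
        # The name does not resolve, so no connection can be attempted.
        return {"node": name, "addresses": [], "reachable": False}
    return {"node": name, "addresses": addresses,
            "reachable": port_open(name, port)}
```

For example, running check_node("compute-node01") on the controller tests the slurmd port of that node, and running check_node with the controller's hostname and SLURMCTLD_PORT on a compute node tests the opposite direction. An empty address list points to a name-resolution problem, while "reachable": False with addresses present suggests a firewall rule or a daemon that is not running.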

huangapple
  • Published on June 19, 2023, 16:37:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76504956.html