Volume Mount in SparkApplication resource not working
Question
I am toying with the Spark operator in Kubernetes, and I am trying to create a SparkApplication resource with the following manifest.
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: pyspark-pi
  namespace: spark-jobs
spec:
  batchScheduler: volcano
  batchSchedulerOptions:
    priorityClassName: routine
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: "<image_name>"
  imagePullPolicy: Always
  mainApplicationFile: local:///spark-files/csv_data.py
  arguments:
    - "10"
  sparkVersion: "3.0.0"
  restartPolicy:
    type: OnFailure
    onFailureRetries: 3
    onFailureRetryInterval: 10
    onSubmissionFailureRetries: 5
    onSubmissionFailureRetryInterval: 20
  timeToLiveSeconds: 86400
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 3.0.0
    serviceAccount: driver-sa
    volumeMounts:
      - name: sparky-data
        mountPath: /spark-data
  executor:
    cores: 1
    instances: 2
    memory: "512m"
    labels:
      version: 3.0.0
    volumeMounts:
      - name: sparky-data
        mountPath: /spark-data
  volumes:
    - name: sparky-data
      hostPath:
        path: /spark-data
I am running this in kind, where I have defined a volume mount to my local system, which holds the data to be processed. I can see the volume mounted in the kind nodes. But when I create the above resource, the driver pod crashes with the error 'no such path'. I printed the contents of the driver pod's root directory and could not see the mounted volume. What is the problem here and how do I fix it?
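For context, exposing a local directory inside a kind node is done through extraMounts in the kind cluster config. A minimal sketch of such a config, assuming a single control-plane node (the host path is a placeholder; the actual cluster config is not shown above):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraMounts:
      # Placeholder: directory on the local machine that holds the data
      - hostPath: /path/on/local/machine
        # Path inside the kind node; this is the path the hostPath
        # volume in the SparkApplication manifest refers to
        containerPath: /spark-data

Note that a hostPath volume resolves against the filesystem of the node the pod is scheduled on, so /spark-data must exist on that node.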
Answer 1
Score: 1
The issue is related to permissions. When mounting a volume into a pod, you need to make sure the permissions are set correctly. Specifically, the user or group running the application in the pod must have the correct permissions to access the data. You should also make sure that the path to the volume is valid and that the volume is properly mounted. To check whether the path exists, you can use the exec command:
kubectl exec -n spark-jobs <pod_name> -- ls /spark-data
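Along the same lines, assuming the image includes the standard id and ls utilities, you can check which user the driver runs as and whether it can read the mount (the pod name is a placeholder):

# Show the UID/GID the driver process runs as
kubectl exec -n spark-jobs <driver_pod_name> -- id
# List the mount with ownership and permission bits
kubectl exec -n spark-jobs <driver_pod_name> -- ls -la /spark-data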
Try adding a security context, which configures privilege and access control settings for a pod.
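A minimal sketch of what that could look like in the driver spec of the manifest above, assuming your operator version accepts a pod-level securityContext on the driver and executor specs; the value 185 is an assumption (a common Spark UID in official images) and must match whatever owns the data on the host:

  driver:
    securityContext:
      runAsUser: 185   # assumed UID of the Spark user in the image; adjust to yours
      fsGroup: 185     # assumed group that should own the mounted files; adjust to yours

The same block would go under the executor spec so that executors get matching access.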
For more information, see the Kubernetes documentation on security contexts.