MLRun, ErrorMessage, No space left on device


Question


I got this error while ingesting data into a FeatureSet:

Error - Failed to save aggregation for /k78/online_detail/nosql/sets/on line_detail/0354467518.ed74fc2b
Response status code was 400: b'{\n\t"ErrorCode": -28,\n\t"ErrorMessage": "No space left on device"\n} 
Update expression was: pr_ph='0354467518';id=7309877;type='r77'

I used standard ingestion code:

import mlrun
import mlrun.feature_store as fs
...
project = mlrun.get_or_create_project(project_name, context='./', user_project=False)
feature_set=featureGetOrCreate(True, project_name, 'sample')
...
fs.ingest(feature_set, df)

It looks like a disk space issue, but I am 100% sure that I had enough free space for the ingest, so the cause must be something else. Has anyone had a similar issue?

Answer 1

Score: 0


The issue was the number of objects on the data nodes, which is subject to the platform limits.
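A side note on the error code: the -28 in the response looks like a negated POSIX errno, and errno 28 is ENOSPC, whose standard message is "No space left on device". On ordinary filesystems ENOSPC can also be raised when object or inode capacity runs out while bytes remain free. That the platform behaves the same way is an assumption, though it is consistent with the diagnosis above. A minimal Python sketch (the helper name errno_name is made up for illustration) decodes the response body quoted in the question:

```python
import errno
import json

# Response body copied from the error message in the question.
body = b'{\n\t"ErrorCode": -28,\n\t"ErrorMessage": "No space left on device"\n}'


def errno_name(response_body: bytes) -> str:
    """Map ErrorCode to a symbolic errno name, assuming it is a negated errno."""
    payload = json.loads(response_body)
    code = abs(int(payload["ErrorCode"]))
    # errno.errorcode maps numeric codes to names such as "ENOSPC".
    return errno.errorcode.get(code, "UNKNOWN")


print(errno_name(body))
```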

You can run the HealthCheckScript (hcs) to see the number of objects in the v3io containers, using this command (with its settings):

hcs -v --dark --test check_cluster_engine_number_of_objects

which produces output like this:

...
[2023-05-25 20:30:20] [TASK] [check_cluster_engine_number_of_objects] Checking the number of objects in the cluster.
[2023-05-25 20:30:20] [SUBTASK] Checking the data cluster's operational status...
[2023-05-25 20:30:21] [INFO] The data cluster is online.
[2023-05-25 20:30:21] [SUBTASK] Counting the number of objects in each container...
[2023-05-25 20:30:30] [INFO] +----------------+------+----------+
[2023-05-25 20:30:30] [INFO] | Container Name | ID   | Items    |
[2023-05-25 20:30:30] [INFO] +----------------+------+----------+
[2023-05-25 20:30:30] [INFO] | projects       | 1033 | 28715350 |
[2023-05-25 20:30:30] [INFO] | users          | 1034 | 179598   |
[2023-05-25 20:30:30] [INFO] | bigdata        | 1035 | 271      |
[2023-05-25 20:30:30] [INFO] | users          | 1036 | 285995   |
[2023-05-25 20:30:30] [INFO] | Total          |      | 29181214 |
[2023-05-25 20:30:30] [INFO] +----------------+------+----------+
...
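To track these counts over time, the table can be parsed from saved hcs output. The sketch below is a minimal Python example that assumes the log layout shown above; the names ROW and parse_items are illustrative and not part of hcs. Rows are keyed by container ID because the name "users" appears twice in the output.

```python
import re

# Sample lines copied from the hcs output above.
SAMPLE = """\
[2023-05-25 20:30:30] [INFO] +----------------+------+----------+
[2023-05-25 20:30:30] [INFO] | Container Name | ID   | Items    |
[2023-05-25 20:30:30] [INFO] +----------------+------+----------+
[2023-05-25 20:30:30] [INFO] | projects       | 1033 | 28715350 |
[2023-05-25 20:30:30] [INFO] | users          | 1034 | 179598   |
[2023-05-25 20:30:30] [INFO] | bigdata        | 1035 | 271      |
[2023-05-25 20:30:30] [INFO] | users          | 1036 | 285995   |
[2023-05-25 20:30:30] [INFO] | Total          |      | 29181214 |
[2023-05-25 20:30:30] [INFO] +----------------+------+----------+
"""

# A data row has a name, a numeric ID, and a numeric item count.
# The header row and the Total row have no numeric ID, so they do not match.
ROW = re.compile(
    r"\|\s*(?P<name>[^|]+?)\s*\|\s*(?P<id>\d+)\s*\|\s*(?P<items>\d+)\s*\|"
)


def parse_items(log_text):
    """Return {container_id: (container_name, item_count)} for each data row."""
    rows = {}
    for line in log_text.splitlines():
        match = ROW.search(line)
        if match:
            rows[int(match["id"])] = (match["name"], int(match["items"]))
    return rows


rows = parse_items(SAMPLE)
total = sum(items for _, items in rows.values())
print(total)
```

Summing the parsed rows should reproduce the Total line (29181214 here), which is a quick check that no row was missed. In this output almost all objects sit in the projects container.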

BTW: The hcs command is available on the data nodes.

Posted by huangapple on 2023-05-25 22:25:07
Source: https://go.coder-hub.com/76333373.html