How to set table properties for a Delta table in PySpark using the DeltaTable API

Question

Below is the code that I am trying in PySpark:

from delta import DeltaTable

delta_table = DeltaTable.forPath(spark, delta_table_path)
delta_table.logRetentionDuration = "interval 1 days"

After this, do we need to save this config, or does it take effect automatically? How can we check the current logRetentionDuration set for the table?
I tried the following to get the properties info:

delta_table.detail()

But it returns an empty {}.

Answer 1

Score: 1

Use Spark SQL:

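# Forward slashes avoid Python reading "\t" in the path as a tab character.
spark.sql("ALTER TABLE delta.`path/to/delta/table` SET TBLPROPERTIES ('delta.logRetentionDuration'='1 days')")
# Show the table details, including the "properties" column.
spark.sql("DESCRIBE DETAIL delta.`path/to/delta/table`").show(truncate=False)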

Under the "properties" column, you can see the log retention duration of the specified Delta table.

Note: If the property was not set successfully, the DESCRIBE DETAIL command above shows {} in the "properties" column.
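
To make the answer easier to reuse, here is a minimal sketch that wraps the two SQL statements in small helpers. The function names (alter_properties_sql, set_table_properties, get_table_properties) are hypothetical and not part of Delta Lake. The sketch assumes an active SparkSession with Delta Lake configured is passed in as spark. It also addresses the "do we need to save" part of the question: assigning an attribute on the DeltaTable Python object only changes that in-memory object, so the property has to be written with ALTER TABLE.

```python
def alter_properties_sql(table_path, properties):
    """Build an ALTER TABLE ... SET TBLPROPERTIES statement for a path-based Delta table."""
    assignments = ", ".join(
        f"'{key}' = '{value}'" for key, value in properties.items()
    )
    return f"ALTER TABLE delta.`{table_path}` SET TBLPROPERTIES ({assignments})"


def set_table_properties(spark, table_path, properties):
    """Run the ALTER TABLE statement so the properties are stored with the table."""
    spark.sql(alter_properties_sql(table_path, properties))


def get_table_properties(spark, table_path):
    """Read the "properties" column of DESCRIBE DETAIL back as a plain dict."""
    detail = spark.sql(f"DESCRIBE DETAIL delta.`{table_path}`")
    row = detail.select("properties").first()
    return dict(row["properties"])
```

With a Delta table at an illustrative path such as /tmp/events, calling set_table_properties(spark, "/tmp/events", {"delta.logRetentionDuration": "interval 1 days"}) and then get_table_properties(spark, "/tmp/events") should return a dict that contains the delta.logRetentionDuration key. An empty dict would suggest the ALTER TABLE statement did not take effect.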
