How to change a column value in the PySpark dataframe with a datatype of an array of structs

Question

How can I change a value in a PySpark DataFrame column whose data type is an array of structs? For example, I would like to divide long_value by 10.

root
 |-- properties: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- key: string (nullable = true)
 |    |    |-- value: struct (nullable = true)
 |    |    |    |-- string_value: string (nullable = true)
 |    |    |    |-- long_value: long (nullable = true)

I tried to do this with the withColumn method, but it returns the same DataFrame.

df.withColumn("properties.value.long_value", col("properties")[0]["value"]["long_value"] / 10 )

Answer 1

Score: 1

Use the transform function together with the Column method withField (both are available in PySpark from Spark 3.1 onward):

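from pyspark.sql import functions as F

# Rebuild every element of the array, replacing value.long_value with long_value / 10
df1 = df.withColumn(
    "properties",
    F.transform(
        "properties",
        lambda x: x.withField(
            "value",
            x["value"].withField("long_value", x["value"].getField("long_value") / 10),
        ),
    ),
)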

huangapple
  • Posted on February 23, 2023 at 20:21:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75544761.html