How to convert a dictionary in string format to a tabular DataFrame in Scala?

Question

I have a method that returns a string whose value looks like a dictionary. For example, the return type is String and the returned value is:

{"firstName":"bb288e8ff56b","lastName":"ae4863bdae026314"}

I want to convert this to a DataFrame with two columns, firstName and lastName.

For now I am only able to store it as a single column in a DataFrame using .toDF():

val df = Seq(returnString).toDF("record")

Can someone help with this?

Answer 1

Score: 2

You can use the from_json function from Spark's functions package (org.apache.spark.sql.functions) to parse the JSON string into a struct, then select the struct's fields as top-level columns:

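import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
// `spark` is assumed to be an existing SparkSession (as in spark-shell or a notebook)
import spark.implicits._

val jsonString = """{"firstName":"bb288e8ff56b","lastName":"ae4863bdae026314"}"""

// Start from the single-column DataFrame described in the question
val df = Seq(jsonString).toDF("record")

// Describe the fields expected inside the JSON string
val schema = StructType(
  Seq(
    StructField("firstName", StringType),
    StructField("lastName", StringType)
  )
)

// Parse the string column into a struct, then promote its fields to columns
val parsedDf = df
  .select(from_json(col("record"), schema).as("parsed"))
  .select("parsed.firstName", "parsed.lastName")

parsedDf.show()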

+------------+----------------+
|   firstName|        lastName|
+------------+----------------+
|bb288e8ff56b|ae4863bdae026314|
+------------+----------------+
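
If the field names are not known ahead of time, one possible alternative is to let Spark infer the schema by reading the string as a Dataset[String] with spark.read.json. The following is only a sketch, not part of the answer above: it assumes a locally created SparkSession (the master and appName values are placeholders), and the names jsonDs and inferredDf are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Assumption: a local session is built here so the sketch is self-contained;
// in spark-shell or a notebook the existing `spark` value would be used instead.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("json-string-to-dataframe")
  .getOrCreate()
import spark.implicits._

val jsonString = """{"firstName":"bb288e8ff56b","lastName":"ae4863bdae026314"}"""

// Wrap the string in a Dataset[String]; each element is treated as one JSON document
val jsonDs = Seq(jsonString).toDS()

// DataFrameReader.json(Dataset[String]) infers the columns from the JSON keys
val inferredDf: DataFrame = spark.read.json(jsonDs)

inferredDf.printSchema()
inferredDf.show(false)
```

Schema inference needs an extra pass over the data, so the explicit schema with from_json is likely the more predictable choice when the fields are known or when some records may omit a key.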

huangapple
  • Posted on March 31, 2023, 16:27:13