How to replace \" with " in string columns in Scala

Question

I have a column of badly formatted nested JSON strings that I'm trying to clean up with regexp_replace so that the column can be read in Scala as a struct. Stray \" sequences have been added around each item in the JSON, and I want to replace each of them with a plain ".

{
{"id": "1", "json": [
{ \"details\": {\n \"name\" : \"john\", \n \"lastname\" : \"doe"\ \n},
\"location\": {\n  \"city\" : \"new york\", \n \"country\" : \"usa\" \n} },
{ \"details\": {\n \"name\" : \"jane\", \n \"lastname\" : \"random"\ \n},
\"location\": {\n  \"city\" : \"new york\", \n \"country\" : \"usa\" \n} },
] },
{"id": "2", "json": [
{ \"details\": {\n \"name\" : \"jack\", \n \"lastname\" : \"ryan"\ \n},
\"location\": {\n  \"city\" : \"york\", \n \"country\" : \"uk\" \n} },
{ \"details\": {\n \"name\" : \"jill\", \n \"lastname\" : \"test"\ \n},
\"location\": {\n  \"city\" : \"LA\", \n \"country\" : \"usa\" \n} },
] }
}

I was able to remove the \n with regexp_replace, but I'm struggling with the \" wrapping each item.

var newdf = df.withColumn( "clean_json",
  regexp_replace(regexp_replace(col("json"), "\n", ""), """\"""", "\\\""))

I've tried using both \\\ and """ as escape characters, but nothing seems to work.

Answer 1

Score: 0

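To match a literal backslash, write it as \\\\ in the pattern string: the string literal reduces that to \\, which the regex engine then reads as a single \.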

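from pyspark.sql import functions as f
from pyspark.sql.types import StringType
df = spark.createDataFrame(['\\"'], StringType()).toDF('value')  # assumes an active SparkSession named spark
df.withColumn('new_value', f.regexp_replace('value', '\\\\"', '')).show(truncate=False)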

+-----+---------+
|value|new_value|
+-----+---------+
|\"   |         |
+-----+---------+
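
The answer's example replaces \" with an empty string, whereas the question asks for a plain ". As a rough sketch of that variant, the snippet below applies the same pattern with Python's built-in re module standing in for Spark's regex engine (an assumption: for a pattern this simple the two should behave the same). The sample string and the helper name unescape_quotes are hypothetical and not taken from the original post.

```python
import json
import re

# Hypothetical sample: JSON text in which every double quote is preceded by a
# stray backslash, similar to the question. The raw string keeps them literal.
dirty = r'{ \"details\": { \"name\" : \"john\", \"lastname\" : \"doe\" } }'

# The Python literal '\\\\"' is three characters: backslash, backslash, quote.
# The regex engine reads that as "one literal backslash followed by a quote".
pattern = '\\\\"'


def unescape_quotes(text: str) -> str:
    """Replace every backslash-quote pair with a plain double quote."""
    return re.sub(pattern, '"', text)


clean = unescape_quotes(dirty)
parsed = json.loads(clean)  # parses only if the stray backslashes are gone

print(clean)
print(parsed["details"]["name"])
```

If this carries over to Spark as expected, changing the replacement argument in the answer's regexp_replace call from '' to '"' should leave a string that a JSON parser can read.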

huangapple
  • Posted on February 8, 2023, 13:41:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75381746.html