PySpark SQL issue with regexp_replace(COALESCE("Today | is | good | day", ''), '\\|', '>')
Question
I am facing an issue with the regexp_replace function when it is used in PySpark SQL. I need to replace the pipe symbol | with >, for example:

regexp_replace(COALESCE("Today | is | good | day", ''), '\\|', '>')

Input: Today | is | good | day
Expected output: Today > is > good > day
But I am getting this instead: >T>o>d>a>y> >|> >i>s> >|> >g>o>o>d> >|> >d>a>y>

It only happens when the SQL is written as:

job_ctx.spark.sql('''select regexp_replace(COALESCE("Today | is | good | day", ''), '\\|','>') as column''')

Can anyone suggest how to solve this in pyspark.sql?
Answer 1
Score: 1
Try escaping with four backslashes instead of two (\\\\). The pattern goes through two rounds of un-escaping: Python collapses \\\\ to \\, the Spark SQL string-literal parser collapses \\ to \, and the regex engine finally reads \| as a literal pipe. With only two backslashes the regex engine ends up seeing a bare |, an empty alternation that matches at every position, which is why a > appears between every character.

spark.sql('select regexp_replace(COALESCE("Today | is | good | day", \'\'),\'\\\\|\',\'>\') as column').show(10,False)
#+-----------------------+
#|column                 |
#+-----------------------+
#|Today > is > good > day|
#+-----------------------+
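The empty-alternation behavior can be reproduced outside Spark with Python's re module. This sketch (plain Python, not Spark, shown only to illustrate why the unescaped pattern misbehaves) assumes Python 3.7+, where re.sub replaces empty matches at every position:

```python
import re

s = "Today | is | good | day"

# A bare '|' is an alternation of two empty branches, so it matches the
# empty string at every position -- hence a '>' between every character.
print(re.sub('|', '>', s))
# >T>o>d>a>y> >|> >i>s> >|> >g>o>o>d> >|> >d>a>y>

# Escaped, '\|' matches the literal pipe character only.
print(re.sub(r'\|', '>', s))
# Today > is > good > day
```

This is exactly the garbled output from the question: once the backslashes are consumed by the string-literal layers, the regex engine never sees the escape.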
Another way is to use the translate function, which does plain per-character substitution rather than regex matching, so no escaping is needed:

spark.sql('select translate(COALESCE("Today | is | good | day", \'\'),\'|\',\'>\') as column').show(10,False)
#+-----------------------+
#|column                 |
#+-----------------------+
#|Today > is > good > day|
#+-----------------------+
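As a quick illustration of those per-character semantics, plain Python's str.translate behaves analogously (this is an analogy only, not Spark itself):

```python
s = "Today | is | good | day"

# Like Spark SQL's translate, str.translate maps characters one-to-one,
# with no regex involved, so the pipe needs no escaping.
print(s.translate(str.maketrans('|', '>')))
# Today > is > good > day
```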