PySpark SQL issue with regexp_replace: regexp_replace(COALESCE("Today | is | good | day", ''), '\\|', '>')

Question

I am having trouble with the regexp_replace function when it is used in PySpark SQL. I need to replace the pipe symbol | with >, for example:

regexp_replace(COALESCE("Today | is | good | day", ''),  '\\|',  '>') 

Input: Today | is | good | day
Expected output: Today > is > good > day
Actual output: >T>o>d>a>y> >|> >i>s> >|> >g>o>o>d> >|> >d>a>y>

This happens only when the SQL is written inside spark.sql:

job_ctx.spark.sql('''select regexp_replace(COALESCE("Today | is | good | day", ''), '\\|','>') as column''')

Can anyone suggest how to solve this in pyspark.sql?

Answer 1

Score: 1

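Try escaping the pipe with four backslashes (\\\\) instead of two. The pattern is unescaped twice, once by Python and once by Spark SQL's string-literal parser, so with only two backslashes none of them reaches the regex engine.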

spark.sql('''select regexp_replace(COALESCE("Today | is | good | day", ''),'\\\\|','>') as column''').show(10,False)
#+-----------------------+
#|column                 |
#+-----------------------+
#|Today > is > good > day|
#+-----------------------+
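
The output in the question can be reproduced without Spark, which may help show where the backslashes go. The sketch below is an illustration only. sql_unescape is a hypothetical, simplified stand-in for a SQL parser that treats the backslash as an escape character in string literals, and Python's re module stands in for the Java regex engine that Spark uses. Both engines read a bare | as an alternation of two empty patterns.

```python
import re


def sql_unescape(literal_body: str) -> str:
    """Hypothetical, simplified model of SQL string-literal unescaping:
    a backslash followed by any character yields just that character."""
    return re.sub(r"\\(.)", r"\1", literal_body)


text = "Today | is | good | day"

# What Python hands over after its own unescaping of the source code.
two = '\\|'      # two backslashes in source  -> the 2 characters \|
four = '\\\\|'   # four backslashes in source -> the 3 characters \\|

# What the regex engine sees after the (modelled) SQL unescaping.
regex_from_two = sql_unescape(two)    # bare |, an empty alternation
regex_from_four = sql_unescape(four)  # \|, a literal pipe

broken = re.sub(regex_from_two, ">", text)
fixed = re.sub(regex_from_four, ">", text)

print(broken)  # >T>o>d>a>y> >|> >i>s> >|> >g>o>o>d> >|> >d>a>y>
print(fixed)   # Today > is > good > day

# A raw string needs only two backslashes to carry the same characters.
raw = r'\\|'
print(raw == four)  # True
```

If four backslashes are hard to read, writing the whole query as a raw string (r'''...''') with '\\|' should send the same text to Spark, since a raw string skips Python's own unescaping step.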

Another option is the translate function, which maps characters one-for-one and takes no regex, so nothing needs escaping.

spark.sql('''select translate(COALESCE("Today | is | good | day", ''),'|','>') as column''').show(10,False)
#+-----------------------+
#|column                 |
#+-----------------------+
#|Today > is > good > day|
#+-----------------------+
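
For intuition about what translate does, here is a rough plain-Python analogy. It uses Python's str.translate rather than Spark, on the assumption that the reader only needs to see the character-for-character mapping. It is not how Spark implements the function.

```python
text = "Today | is | good | day"

# Build a one-to-one character map: every '|' becomes '>'.
# No regex is involved, so the pipe has no special meaning here.
table = str.maketrans("|", ">")
result = text.translate(table)

print(result)  # Today > is > good > day
```

Because the mapping works on single characters, this approach fits cases like this one where one character is swapped for another. Replacing longer substrings or patterns still calls for regexp_replace.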

huangapple
  • Posted on April 11, 2023, 05:25:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75980860.html