April 11, 2023, 05:25:28

PySpark SQL issue with regexp_replace: regexp_replace(COALESCE("Today | is | good | day", ''), '\\|', '>')

Question

I am facing an issue with the regexp_replace function when it is used in PySpark SQL. I need to replace the pipe symbol | with >, for example:

regexp_replace(COALESCE("Today | is | good | day", ''), '\\|', '>')

Input: Today | is | good | day
Expected output: Today > is > good > day
Actual output: >T>o>d>a>y> >|> >i>s> >|> >g>o>o>d> >|> >d>a>y>

It happens only when the SQL is written inside spark.sql:

job_ctx.spark.sql('''select regexp_replace(COALESCE("Today | is | good | day", ''), '\\|','>') as column''')

Can anyone suggest how to solve this in pyspark.sql?

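Answer 1

Score: 1

Try escaping with four backslashes instead of two: \\\\

In a normal Python string, '\\|' reaches Spark SQL as '\|'. The SQL parser then treats the backslash as an escape character of its own and reduces the literal to |, a regex that matches the empty string at every position. With four backslashes, the regex engine receives \| and matches a literal pipe.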

spark.sql('select regexp_replace(COALESCE("Today | is | good | day", \'\'),\'\\\\|\',\'>\') as column').show(10,False)
#+-----------------------+
#|column                 |
#+-----------------------+
#|Today > is > good > day|
#+-----------------------+
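The two escaping layers can be imitated without a Spark session. The sketch below is only an approximation: `sql_unescape` is a hypothetical helper that stands in for the SQL parser's string-literal unescaping (the real parser handles more cases), and Python's `re` module stands in for the Java regex engine that Spark uses. Under those assumptions it reproduces both the broken and the expected output from the question.

```python
import re

text = "Today | is | good | day"


def sql_unescape(literal: str) -> str:
    """Hypothetical, simplified stand-in for SQL string-literal unescaping:
    a backslash followed by any character yields just that character."""
    return re.sub(r"\\(.)", r"\1", literal, flags=re.S)


# Layer 1: Python string literals (non-raw).
two_slashes = "\\|"     # Python stores the 2 characters: \|
four_slashes = "\\\\|"  # Python stores the 3 characters: \\|

# Layer 2: the SQL parser unescapes once more before the regex engine sees it.
broken_pattern = sql_unescape(two_slashes)   # becomes: |   (empty alternation)
fixed_pattern = sql_unescape(four_slashes)   # becomes: \|  (literal pipe)

# Layer 3: the regex engine applies the resulting pattern.
broken = re.sub(broken_pattern, ">", text)
fixed = re.sub(fixed_pattern, ">", text)

print(broken)
print(fixed)
```

The first pattern is an alternation of two empty branches, so it matches between every pair of characters, which is why > shows up everywhere and the pipes survive.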

Another option is the translate function, which maps characters literally instead of using a regex, so no escaping is needed:

spark.sql('select translate(COALESCE("Today | is | good | day", \'\'),\'|\',\'>\') as column').show(10,False)
#+-----------------------+
#|column                 |
#+-----------------------+
#|Today > is > good > day|
#+-----------------------+
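For intuition, plain Python has a close analogue of this character-for-character mapping. This is an illustration of the idea only, not Spark code: `str.maketrans` and `str.translate` are Python built-ins, and the assumption is that SQL translate behaves the same way for a single-character mapping like this one.

```python
text = "Today | is | good | day"

# Build a one-to-one character mapping: every "|" becomes ">".
# No regex is involved, so "|" has no special meaning here.
table = str.maketrans("|", ">")
result = text.translate(table)

print(result)
```

Because no pattern is compiled, there is nothing to escape at either the Python or the SQL level.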
