Select only a specific part of a STRING column
Question


I have a table with a "Description" column. I need to query that table and get only a specific part of that column.
Here are a few examples of what's inside the "Description" column:

|DESCRIPTION|
-------------
|someRandomt|
|it has 0.5g|
|23g is enou|
|otherRandom|
|55g, 0.1g, |

In the SELECT statement, I need to select only the numeric values (float or int) that are immediately followed by the letter 'g'. If a row contains more than one such value, I need to concatenate them using '/'. The final result in this case should look like this:

|DESCRIPTION|
-------------
|null       |
|0.5g       |
|23g        |
|null       |
|55g/0.1g   |

Answer 1

Score: 1

Extract the numbers followed by "g" with `regexp_extract_all`, then use `array_join` to combine them into a single string.

Here is a code example for PySpark:

from pyspark.sql import functions as f

# regexp_extract_all is available in Spark SQL 3.1 and later.
# [.] matches a literal decimal point; an unescaped . would match any character.
(df.withColumn('nums', f.array_join(f.expr('regexp_extract_all(DESCRIPTION, "([0-9]+([.][0-9]+)?g)")'), '/'))
   .show())

+-----------+--------+
|DESCRIPTION|    nums|
+-----------+--------+
|someRandomt|        |
|it has 0.5g|    0.5g|
|23g is enou|     23g|
|otherRandom|        |
|55g, 0.1g, |55g/0.1g|
+-----------+--------+
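To check the pattern without a Spark session, the same extract-then-join idea can be sketched in plain Python with the standard `re` module. This is only an illustration under stated assumptions: the helper name `extract_grams` is hypothetical, the decimal point is written as `[.]` so it matches a literal dot, and rows with no match return `None` to mirror the `null` the question asks for.

```python
import re
from typing import Optional

# Same shape as the pattern in the answer: digits, an optional decimal part, then "g".
# [.] matches only a literal dot; a bare . would match any character.
GRAM_PATTERN = re.compile(r"[0-9]+(?:[.][0-9]+)?g")


def extract_grams(description: str) -> Optional[str]:
    """Return every '<number>g' token joined by '/', or None when there is none."""
    matches = GRAM_PATTERN.findall(description)
    return "/".join(matches) if matches else None


# Sample rows taken from the question.
rows = ["someRandomt", "it has 0.5g", "23g is enou", "otherRandom", "55g, 0.1g, "]
for row in rows:
    print(repr(row), "->", extract_grams(row))
```

Note that the PySpark version returns an empty string rather than null for rows without a match, because `array_join` on an empty array yields ''. If null is required there, wrapping the joined expression in Spark SQL's `nullif(..., '')` is one possible approach.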



huangapple
  • Posted on March 7, 2023 at 23:29:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75663971.html