2023-03-31 15:50:53


Extracting SLA information from PostgreSQL table with text matching and value extraction

Question


I have the following PostgreSQL table:

CREATE TABLE test (
  id INT,
  description TEXT
);

INSERT INTO test VALUES
(1, 'Some text'),
(2, '123 blabla The&nbsp;average processing time for this type of request is 5 days. blabla'),
(3, 'blalbla The&nbsp;average processing time for this type of request is 5 days.
This delay is reduced to 1 day for requests with high or critical priority. blabla'),
(4, 'blalbla The average processing time for this type of request is 7 days.
This delay is reduced to 2 day for requests with high or critical priority. blabla'),
(5, 'blabla The average processing time for this type of request is 3 days. blabla');

I need to get the following output:

ID	SLA text	days
1	No
2	Yes	5
3	Yes	1/5
4	Yes	2/7
5	Yes	3

For the moment, I'm able to tell whether the field contains an SLA value using:

SELECT
    id,
    CASE
        WHEN regexp_replace(description, '&nbsp;', ' ', 'g') ILIKE '%The average processing%'
        THEN 'Yes'
        ELSE 'No'
    END AS "SLA text"
FROM
    test

I need to check whether the field description contains either of the following texts:

The average processing time for this type of request is 5 days.
The average processing time for this type of request is 5 days. This
delay is reduced to 1 day for requests with high or critical
priority.

If either text is present, then mark the field (SLA text) as "Yes", otherwise mark it as "No".

If the field is marked as "Yes", then retrieve the integer value from the text.

For example, if the text is blabla The average processing time for this type of request is 5 days. blabla, the value to be retrieved is 5. If the text is The average processing time for this type of request is 5 days. This delay is reduced to 1 day for requests with high or critical priority., the value to be retrieved is 1/5. This retrieved value should be stored in the days column.

Demo : https://www.db-fiddle.com/f/iNxLeZosApNzTyp9RNTK4r/1

Answer 1

Score: 1

You can use regexp_replace for this too:

select id, "SLA text",
    case when "SLA text" = 'Yes' then
        trim(leading '/' from regexp_replace(text,'(?:.*The average processing time for this type of request is (\d+) days\.)(?:
This delay is reduced to (\d+) day for requests with high or critical priority.)?.*' , '\2/\1', 'g'))
    else '' end "SLA2 text"
from(
  SELECT
      id,
      CASE
          WHEN regexp_replace(description, '&nbsp;', ' ', 'g') ILIKE '%The average processing%'
          THEN 'Yes'
          ELSE 'No'
      END AS "SLA text",
      regexp_replace(description, '&nbsp;', ' ', 'g') text
  FROM
      test
) t

Here, regexp_replace replaces the whole description with the digits captured by the pattern, written as reduced/regular. The trim then removes the leading slash in the case where the second number is missing.
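As a rough illustration of that back-reference and trim step, here is a minimal sketch that runs on two literal strings instead of the test table. The shortened pattern, the single space between the two sentences, and the names s, label and txt are assumptions made for this example only; they are not part of the answer above.

```sql
-- Sketch only: literal inputs, so no table is needed.
-- \1 is the regular delay, \2 the optional reduced delay.
-- When the optional group does not match, \2 is empty and the result
-- starts with '/', which trim(leading '/' ...) then strips.
SELECT
    s.label,
    trim(leading '/' from regexp_replace(
        s.txt,
        '.*is (\d+) days\.(?: This delay is reduced to (\d+) day)?.*',
        '\2/\1'
    )) AS days
FROM (VALUES
    ('one number',
     'blabla The average processing time for this type of request is 5 days. blabla'),
    ('two numbers',
     'The average processing time for this type of request is 5 days. This delay is reduced to 1 day for requests with high or critical priority.')
) AS s(label, txt);
-- one number  -> 5
-- two numbers -> 1/5
```

Because the pattern starts and ends with .*, it consumes the whole string, so only the captured digits survive the replacement.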

Answer 2

Score: 1


You can use substring with regular expressions:

SELECT
    id,
    CASE
        WHEN regexp_replace(description, '&nbsp;', ' ', 'g') ILIKE '%The average processing%'
        THEN 'Yes'
        ELSE 'No'
    END AS "SLA text",
    substring(description, '([0-9]*) day') days
FROM
    test
WHERE substring(description, '([0-9]*) day') IS NULL

id	SLA text	days
1	No	null
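The query above relies on how substring behaves with a regular expression: it returns the text captured by the first parenthesized group, or NULL when the pattern does not match at all. A small sketch on literal strings, with column names chosen only for this example:

```sql
-- Sketch only: substring(string, pattern) on literals.
SELECT
    substring('request is 5 days. blabla', '([0-9]+) day') AS first_number,  -- '5'
    substring('Some text', '([0-9]+) day')                 AS no_match;      -- NULL
```

This sketch uses [0-9]+ rather than the [0-9]* from the answer. With *, a description containing " day" without a preceding number would match with an empty capture and return '' instead of NULL, so such a row would not be caught by the IS NULL filter.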

SELECT
    id,
    CASE
        WHEN regexp_replace(description, '&nbsp;', ' ', 'g') ILIKE '%The average processing%'
        THEN 'Yes'
        ELSE 'No'
    END AS "SLA text",
    substring(description, '([0-9]*) day') days
FROM
    test
WHERE substring(description, '([0-9]*) day') IS NULL
UNION ALL
SELECT
    id,
    MAX(CASE
        WHEN regexp_replace(description, '&nbsp;', ' ', 'g') ILIKE '%The average processing%'
        THEN 'Yes'
        ELSE 'No'
    END) AS "SLA text",
    STRING_AGG(match[1], '/' ORDER BY match[1]) AS days
FROM test
CROSS JOIN LATERAL regexp_matches(description, '([0-9]*) day', 'g') AS match
GROUP BY id

id	SLA text	days
1	No	null
2	Yes	5
3	Yes	1/5
4	Yes	2/7
5	Yes	3
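One possible variation on the query above, offered as a sketch rather than as the answer's own method: a LEFT JOIN LATERAL keeps rows without any match, so the UNION ALL branch is not needed, and casting to int sorts the numbers numerically (as text, '10' would sort before '5'). The CTE names sample and cleaned, the alias m(val), and the three sample rows are assumptions made so the example runs without the test table.

```sql
-- Sketch only: sample rows are inlined so the query is self-contained.
WITH sample(id, description) AS (
    VALUES
        (1, 'Some text'),
        (2, 'blabla The&nbsp;average processing time for this type of request is 5 days. blabla'),
        (3, 'The average processing time for this type of request is 7 days. This delay is reduced to 2 day for requests with high or critical priority.')
),
cleaned AS (
    -- Turn the literal '&nbsp;' entity into a plain space once, up front.
    SELECT id, replace(description, '&nbsp;', ' ') AS txt
    FROM sample
)
SELECT
    c.id,
    CASE WHEN c.txt ILIKE '%The average processing%' THEN 'Yes' ELSE 'No' END AS "SLA text",
    -- One row per "<digits> day" match; rows without a match keep a NULL here.
    string_agg(m.val[1], '/' ORDER BY m.val[1]::int) AS days
FROM cleaned AS c
LEFT JOIN LATERAL regexp_matches(c.txt, '([0-9]+) day', 'g') AS m(val) ON true
GROUP BY c.id, c.txt
ORDER BY c.id;
-- 1  No   NULL
-- 2  Yes  5
-- 3  Yes  2/7
```

Like the original query, this picks up every "N day" occurrence in the description, and it yields reduced/regular order only because the reduced delay is the smaller number.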
