Extracting SLA information from PostgreSQL table with text matching and value extraction

Question


I have the following PostgreSQL table:

CREATE TABLE test (
  id INT,
  description TEXT
);

INSERT INTO test VALUES 
(1, 'Some text'),
(2, '123 blabla The average processing time for this type of request is 5 days. blabla'),
(3, 'blalbla The average processing time for this type of request is 5 days.
This delay is reduced to 1 day for requests with high or critical priority. blabla'),
(4, 'blalbla The average processing time for this type of request is 7 days.
This delay is reduced to 2 day for requests with high or critical priority. blabla'),
(5, 'blabla The average processing time for this type of request is 3 days. blabla');

I need to get the following output:

ID | SLA text | days
 1 | No       |
 2 | Yes      | 5
 3 | Yes      | 1/5
 4 | Yes      | 2/7
 5 | Yes      | 3

For the moment, I can tell whether the field contains an SLA value using:

SELECT 
    id,
    CASE 
        WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%' 
        THEN 'Yes'
        ELSE 'No'
    END AS "SLA text"
FROM 
    test

I need to check if the field description contains either of the following text:

  • The average processing time for this type of request is 5 days.
  • The average processing time for this type of request is 5 days. This
    delay is reduced to 1 day for requests with high or critical
    priority.

If either text is present, then mark the field (SLA text) as "Yes", otherwise mark it as "No".

If the field is marked as "Yes", then retrieve the integer value from the text.

For example, if the text is blabla The average processing time for this type of request is 5 days. blabla, the value to be retrieved is 5. If the text is The average processing time for this type of request is 5 days. This delay is reduced to 1 day for requests with high or critical priority., the value to be retrieved is 1/5. This retrieved value should be stored in the days column.

Demo : https://www.db-fiddle.com/f/iNxLeZosApNzTyp9RNTK4r/1

Answer 1

Score: 1


You can use regexp_replace for this too:

select id, "SLA text",
    case when "SLA text" = 'Yes' then
        trim(leading '/' from regexp_replace(text, '(?:.*The average processing time for this type of request is (\d+) days\.)(?:
This delay is reduced to (\d+) day for requests with high or critical priority.)?.*', '\2/\1', 'g'))
    else '' end "SLA2 text"
from (
    SELECT
        id,
        CASE
            WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%'
            THEN 'Yes'
            ELSE 'No'
        END AS "SLA text",
        regexp_replace(description, ' ', ' ', 'g') text
    FROM
        test
) t

Here the whole description is replaced with the digits captured by the pattern, written as second number/first number. The leading slash is then trimmed for the case where the second number is missing.
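As an alternative sketch of the same capture idea (not part of the original answer), regexp_match, available from PostgreSQL 10, can return both captured numbers as an array, and concat_ws skips a missing second number, so no trimming is needed. The \s* between the two sentences and the optional "s" in days? are assumptions about how the text may vary, and the aliases t and m are only illustrative.

```sql
-- Sketch only: assumes PostgreSQL 10+ and the "test" table from the question.
SELECT
    id,
    -- regexp_match returns NULL when the SLA sentence is absent
    CASE WHEN m IS NULL THEN 'No' ELSE 'Yes' END AS "SLA text",
    -- m[1] = standard delay, m[2] = reduced delay (NULL if that sentence is missing);
    -- concat_ws ignores NULL arguments, so a lone first number yields e.g. '5'
    concat_ws('/', m[2], m[1]) AS days
FROM (
    SELECT
        id,
        regexp_match(
            description,
            'The average processing time for this type of request is (\d+) days?\.(?:\s*This delay is reduced to (\d+) days? for requests with high or critical priority\.)?'
        ) AS m
    FROM test
) t
ORDER BY id;
```

For rows without the SLA sentence, concat_ws receives only NULLs and returns an empty string rather than NULL, which matches the blank days cell requested in the question.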

Answer 2

Score: 1


You can use substring with regular expressions:

The first part returns only the rows that contain no day count:

SELECT
    id,
    CASE
        WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%'
        THEN 'Yes'
        ELSE 'No'
    END AS "SLA text",
    substring(description, '([0-9]*) day') days
FROM
    test
WHERE substring(description, '([0-9]*) day') IS NULL

id | SLA text | days
 1 | No       | null
Combining it through UNION ALL with a query that aggregates every match per row returns all rows:

SELECT
    id,
    CASE
        WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%'
        THEN 'Yes'
        ELSE 'No'
    END AS "SLA text",
    substring(description, '([0-9]*) day') days
FROM
    test
WHERE substring(description, '([0-9]*) day') IS NULL
UNION ALL
SELECT
    id,
    MAX(CASE
        WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%'
        THEN 'Yes'
        ELSE 'No'
    END) AS "SLA text",
    STRING_AGG(match[1], '/' ORDER BY match[1]) AS days
FROM test
CROSS JOIN LATERAL regexp_matches(description, '([0-9]*) day', 'g') AS match
GROUP BY id

id | SLA text | days
 1 | No       | null
 2 | Yes      | 5
 3 | Yes      | 1/5
 4 | Yes      | 2/7
 5 | Yes      | 3
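
A single-query variant of this idea can be sketched with LEFT JOIN LATERAL in place of the UNION ALL. This is an illustration under stated assumptions, not the answer's own query: it uses [0-9]+ instead of [0-9]* so that every match holds at least one digit and can be cast to an integer, and it orders the matches numerically, since ordering the captured text would typically place '10' before '5'. Like the query above, it assumes the reduced delay is always the smaller number. The aliases t and m(nums) are hypothetical.

```sql
-- Sketch only: assumes the "test" table from the question.
SELECT
    t.id,
    CASE
        WHEN t.description ILIKE '%The average processing%' THEN 'Yes'
        ELSE 'No'
    END AS "SLA text",
    -- string_agg ignores NULLs, so rows without any match get NULL in days
    string_agg(m.nums[1], '/' ORDER BY (m.nums[1])::int) AS days
FROM test AS t
-- LEFT JOIN keeps rows for which regexp_matches returns no match at all
LEFT JOIN LATERAL regexp_matches(t.description, '([0-9]+) day', 'g') AS m(nums)
    ON true
GROUP BY t.id, t.description
ORDER BY t.id;
```

Grouping by description as well as id is needed here because the sample table declares no primary key, so PostgreSQL cannot infer that description depends on id.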


huangapple
  • Published on March 31, 2023, 15:50:53
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75896090.html