英文:
Extracting SLA information from PostgreSQL table with text matching and value extraction
问题
我需要检查字段description是否包含以下任一文本:
- This average processing time for this type of request is 5 days.
 - This average processing time for this type of request is 5 days. This delay is reduced to 1 day for requests with high or critical priority.
 
如果存在任一文本,则将该字段(SLA text)标记为“Yes”,否则标记为“No”。
如果将字段标记为“Yes”,则从文本中检索整数值。
例如,如果文本是 blabla This average processing time for this type of request is 5 days. blabla,要检索的值是 5。如果文本是 This average processing time for this type of request is 5 days. This delay is reduced to 1 day for requests with high or critical priority.,要检索的值是 1/5。此检索到的值应存储在days列中。
英文:
I have the following postgresql :
CREATE TABLE test (
  id INT,
  description TEXT
);
INSERT INTO test VALUES 
(1, 'Some text'),
(2, '123 blabla The average processing time for this type of request is 5 days. blabla'),
(3, 'blalbla The average processing time for this type of request is 5 days.
This delay is reduced to 1 day for requests with high or critical priority. blabla'),
(4, 'blalbla The average processing time for this type of request is 7 days.
This delay is reduced to 2 day for requests with high or critical priority. blabla'),
(5, 'blabla The average processing time for this type of request is 3 days. blabla');
I need to get the following output :
| ID | SLA text | days | 
|---|---|---|
| 1 | No | |
| 2 | Yes | 5 | 
| 3 | Yes | 1/5 | 
| 4 | Yes | 2/7 | 
| 5 | Yes | 3 | 
For the moment, I'm able to tell if the field contain a SLA value using :
SELECT 
    id,
    CASE 
        WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%' 
        THEN 'Yes'
        ELSE 'No'
    END AS "SLA text"
FROM 
    test
I need to check if the field description contains either of the following text:
- The average processing time for this type of request is 5 days.
 - The average processing time for this type of request is 5 days. This
delay is reduced to 1 day for requests with high or critical
priority. 
If either text is present, then mark the field (SLA text) as "Yes", otherwise mark it as "No".
If the field is marked as "Yes", then retrieve the integer value from the text.
For example, if the text is blabla The average processing time for this type of request is 5 days. blabla, the value to be retrieved is 5. If the text is The average processing time for this type of request is 5 days. This delay is reduced to 1 day for requests with high or critical priority., the value to be retrieved is 1/5. This retrieved value should be stored in the days column.
答案1
得分: 1
你也可以使用 regex_replace:
select id, "SLA text",
	case when "SLA text" = 'Yes' then
    	trim(leading '/' from regexp_replace(text,'(?:.*The average processing time for this type of request is (\d+) days\.)(?:
This delay is reduced to (\d+) day for requests with high or critical priority.)?.*' , '\2/\1', 'g'))
    else '' end "SLA2 text"
from(
  SELECT 
      id,
      CASE 
          WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%' 
          THEN 'Yes'
          ELSE 'No'
      END AS "SLA text",
  	  regexp_replace(description, ' ', ' ', 'g') text
  FROM 
      test
) t
英文:
You can use regex_replace for this too:
select id, "SLA text",
	case when "SLA text" = 'Yes' then
    	trim(leading '/' from regexp_replace(text,'(?:.*The average processing time for this type of request is (\d+) days\.)(?:
This delay is reduced to (\d+) day for requests with high or critical priority.)?.*' , '\2/\1', 'g'))
    else '' end "SLA2 text"
from(
  SELECT 
      id,
      CASE 
          WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%' 
          THEN 'Yes'
          ELSE 'No'
      END AS "SLA text",
  	  regexp_replace(description, ' ', ' ', 'g') text
  FROM 
      test
) t
Here we replace your pattern with found digits from this pattern. And trim leading slash for case when second number is missing.
答案2
得分: 1
您可以使用正则表达式和子字符串功能。
英文:
You can use substirng with regular expressions
SELECT 
    id,
    CASE 
        WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%' 
        THEN 'Yes'
        ELSE 'No'
    END AS "SLA text",
  substring(description, '([0-9]*) day') days
FROM 
    test
where substring(description, '([0-9]*) day') IS NULL
| id | SLA text | days | 
|---|---|---|
| 1 | No | null | 
| > ``` status | ||
| > SELECT 1 | ||
| > ``` | 
SELECT 
    id,
    CASE 
        WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%' 
        THEN 'Yes'
        ELSE 'No'
    END AS "SLA text",
  substring(description, '([0-9]*) day') days
FROM 
    test
where substring(description, '([0-9]*) day') IS NULL
UNION ALL
select 
  id,
    MAX(CASE 
        WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%' 
        THEN 'Yes'
        ELSE 'No'
    END) AS "SLA text",  
  STRING_AGG(match[1], '/' ORDER BY match[1]) as days
from test
cross join lateral regexp_matches(description, '([0-9]*) day', 'g') as match
Group by id
| id | SLA text | days | 
|---|---|---|
| 1 | No | null | 
| 2 | Yes | 5 | 
| 3 | Yes | 1/5 | 
| 4 | Yes | 2/7 | 
| 5 | Yes | 3 | 
| > ``` status | ||
| > SELECT 5 | ||
| > ``` | 
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。


评论