Extracting SLA information from PostgreSQL table with text matching and value extraction
Question
I have the following PostgreSQL table:
CREATE TABLE test (
id INT,
description TEXT
);
INSERT INTO test VALUES
(1, 'Some text'),
(2, '123 blabla The average processing time for this type of request is 5 days. blabla'),
(3, 'blalbla The average processing time for this type of request is 5 days.
This delay is reduced to 1 day for requests with high or critical priority. blabla'),
(4, 'blalbla The average processing time for this type of request is 7 days.
This delay is reduced to 2 day for requests with high or critical priority. blabla'),
(5, 'blabla The average processing time for this type of request is 3 days. blabla');
I need to get the following output:
ID | SLA text | days |
---|---|---|
1 | No | |
2 | Yes | 5 |
3 | Yes | 1/5 |
4 | Yes | 2/7 |
5 | Yes | 3 |
For the moment, I'm able to tell whether the field contains an SLA value using:
SELECT
id,
CASE
WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%'
THEN 'Yes'
ELSE 'No'
END AS "SLA text"
FROM
test
I need to check whether the field description contains either of the following texts:
- The average processing time for this type of request is 5 days.
- The average processing time for this type of request is 5 days. This delay is reduced to 1 day for requests with high or critical priority.
If either text is present, then mark the field (SLA text) as "Yes"; otherwise mark it as "No".
If the field is marked as "Yes", then retrieve the integer value from the text.
For example, if the text is "blabla The average processing time for this type of request is 5 days. blabla", the value to be retrieved is 5. If the text is "The average processing time for this type of request is 5 days. This delay is reduced to 1 day for requests with high or critical priority.", the value to be retrieved is 1/5. This retrieved value should be stored in the days column.
Answer 1
Score: 1
You can use regexp_replace for this too:
select id, "SLA text",
case when "SLA text" = 'Yes' then
trim(leading '/' from regexp_replace(text,'(?:.*The average processing time for this type of request is (\d+) days\.)(?:
This delay is reduced to (\d+) day for requests with high or critical priority.)?.*' , '\2/\1', 'g'))
else '' end "SLA2 text"
from(
SELECT
id,
CASE
WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%'
THEN 'Yes'
ELSE 'No'
END AS "SLA text",
regexp_replace(description, ' ', ' ', 'g') text
FROM
test
) t
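Here regexp_replace replaces the whole description with the digits captured by the pattern, written as reduced/normal (\2/\1). When the second sentence is missing, the second group is empty, so trim removes the leading slash that is left behind.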
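As a possible alternative, here is a minimal sketch that captures both numbers with regexp_match instead of rewriting the whole string. It assumes PostgreSQL 10 or later (where regexp_match is available), and it assumes the two sentences are separated only by whitespace, which may include a line break. The aliases t and m are only illustrative names.

```sql
-- Sketch: capture the normal delay (group 1) and the optional reduced delay (group 2).
-- regexp_match returns NULL when the pattern does not match at all,
-- and a NULL array element for an optional group that did not take part in the match.
SELECT
    id,
    CASE WHEN m IS NULL THEN 'No' ELSE 'Yes' END AS "SLA text",
    -- concat_ws skips NULL arguments, so a missing reduced delay leaves no stray slash;
    -- NULLIF turns the empty string produced for non-matching rows back into NULL.
    NULLIF(concat_ws('/', m[2], m[1]), '') AS days
FROM (
    SELECT
        id,
        regexp_match(
            description,
            'The average processing time for this type of request is (\d+) days?\.(?:\s+This delay is reduced to (\d+) days? for requests with high or critical priority\.)?',
            'i'  -- case-insensitive, mirroring the ILIKE test in the question
        ) AS m
    FROM test
) AS t
ORDER BY id;
```

Because the match result itself says whether the sentence was found, the "SLA text" flag and the days value come from the same pattern, so they should not be able to disagree.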
Answer 2
Score: 1
You can use substring with regular expressions:
SELECT
id,
CASE
WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%'
THEN 'Yes'
ELSE 'No'
END AS "SLA text",
substring(description, '([0-9]*) day') days
FROM
test
where substring(description, '([0-9]*) day') IS NULL
id | SLA text | days |
---|---|---|
1 | No | null |
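A small sketch of why substring alone is not enough for the two-number case: substring with a pattern returns only the first match, while regexp_matches with the 'g' flag returns one row per match. The literal string and the aliases first_match, every_match and m are made up for illustration.

```sql
-- substring(text, pattern) returns the captured group of the first match only
SELECT substring(
    'is 5 days. This delay is reduced to 1 day',
    '([0-9]+) day'
) AS first_match;

-- regexp_matches with the 'g' flag returns one row (a text array) per match
SELECT m[1] AS every_match
FROM regexp_matches(
    'is 5 days. This delay is reduced to 1 day',
    '([0-9]+) day',
    'g'
) AS m;
```

This is why rows with a match are handled separately with regexp_matches and then aggregated back into a single value per id.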
-- Rows without any "<n> day" match: keep them, with days left NULL
SELECT
id,
CASE
WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%'
THEN 'Yes'
ELSE 'No'
END AS "SLA text",
substring(description, '([0-9]+) day') days
FROM
test
where substring(description, '([0-9]+) day') IS NULL
UNION ALL
-- Rows with at least one match: one row per match, joined back together with '/'
select
id,
MAX(CASE
WHEN regexp_replace(description, ' ', ' ', 'g') ILIKE '%The average processing%'
THEN 'Yes'
ELSE 'No'
END) AS "SLA text",
-- Cast before sorting so that, for example, 10 sorts after 5 rather than before it
STRING_AGG(match[1], '/' ORDER BY match[1]::int) as days
from test
cross join lateral regexp_matches(description, '([0-9]+) day', 'g') as match
Group by id
id | SLA text | days |
---|---|---|
1 | No | null |
2 | Yes | 5 |
3 | Yes | 1/5 |
4 | Yes | 2/7 |
5 | Yes | 3 |