英文:
Regular expression to find source and target
问题
需要帮忙编写通用正则表达式以查找以下情景中的源表:
示例1输出:
xyz.test
示例2输出:无输出,因为sample.Test是目标表而不是源表。
我尝试过下面的正则表达式 \b(?!DELETE\s+FROM\s+)(?:FROM|JOIN)\s+(\w+./*[^\s]+)
,它对第一个情景起作用,但对第二个情景预期没有输出,然而我得到了sample.test。
英文:
Need help to write generic regular expression for to find the source table from the below scenarios:
Sample 1:
INSERT INTO a.test
(
test1,
test2
)
select
test1,
test2
from
xyz.test
Output 1:
xyz.test
Sample 2:
*******************************************************
MACRO
abc.test
PURPOSE
TO DELETE THE DATA FROM abc.test
INPUT PARAMS
NONE
CALLED BY
THIS MACRO IS BEING CALLED FROM sql
*******************************************************/
DELETE FROM sample.test
No output for this as sample.Test is target table not source.
I have tried below regex \b(?!DELETE\s+FROM\s+)(?:FROM|JOIN)\s+(\w+./*[^\s]+)
it's working for 1st scenario but for 2nd scenario I was expecting no output however I am getting sample.test
答案1
得分: 2
I really don't think that a regular expression is the safe way to go.
There are really too many things to handle, such as comments, unions, etc.
I had a play with the
SQLGlot library with this code:
##python3 -m pip install sqlglot
from sqlglot import parse_one, exp
queries = [
INSERT INTO a.test
(
test1,
test2
)
select
test1,
test2
from
xyz.test
Source tables:
['xyz.test']
/*******************************************************
MACRO
abc.test
PURPOSE
TO DELETE THE DATA FROM abc.test
INPUT PARAMS
NONE
CALLED BY
THIS MACRO IS BEING CALLED FROM sql
*******************************************************/
DELETE FROM sample.test
Source tables:
[]
SELECT
t1.firstname,
t1.lastname,
t2.email,
t3.order
FROM abc.table1 t1
LEFT JOIN abc.table2 t2 ON t2.id = t1.id
LEFT JOIN abc.table3 t3 ON t3.email = t2.email
Source tables:
['abc.table1 AS t1', 'abc.table2 AS t2', 'abc.table3 AS t3']
SELECT id, name FROM table1
UNION
SELECT id, name FROM table2
Source tables:
['table1', 'table2']
英文:
I really don't think that a regular expression is the safe way to go.
There are really too many things to handle, such as comments, unions, etc.
I had a play with the
SQLGlot library with this code:
##python3 -m pip install sqlglot
from sqlglot import parse_one, exp
queries = [
"""
INSERT INTO a.test
(
test1,
test2
)
select
test1,
test2
from
xyz.test
""",
"""
/*******************************************************
MACRO
abc.test
PURPOSE
TO DELETE THE DATA FROM abc.test
INPUT PARAMS
NONE
CALLED BY
THIS MACRO IS BEING CALLED FROM sql
*******************************************************/
DELETE FROM sample.test
""",
"""
SELECT
t1.firstname,
t1.lastname,
t2.email,
t3.order
FROM abc.table1 t1
LEFT JOIN abc.table2 t2 ON t2.id = t1.id
LEFT JOIN abc.table3 t3 ON t3.email = t2.email
""",
"""
SELECT id, name FROM table1
UNION
SELECT id, name FROM table2
"""
]
for query in queries:
print(query)
sourceTables = []
for select in parse_one(query).find_all(exp.Select):
for table in select.find_all(exp.Table):
sourceTables.append(str(table))
print("Source tables:")
print(sourceTables)
print("\n")
It outputs the following:
INSERT INTO a.test
(
test1,
test2
)
select
test1,
test2
from
xyz.test
Source tables:
['xyz.test']
/*******************************************************
MACRO
abc.test
PURPOSE
TO DELETE THE DATA FROM abc.test
INPUT PARAMS
NONE
CALLED BY
THIS MACRO IS BEING CALLED FROM sql
*******************************************************/
DELETE FROM sample.test
Source tables:
[]
SELECT
t1.firstname,
t1.lastname,
t2.email,
t3.order
FROM abc.table1 t1
LEFT JOIN abc.table2 t2 ON t2.id = t1.id
LEFT JOIN abc.table3 t3 ON t3.email = t2.email
Source tables:
['abc.table1 AS t1', 'abc.table2 AS t2', 'abc.table3 AS t3']
SELECT id, name FROM table1
UNION
SELECT id, name FROM table2
Source tables:
['table1', 'table2']
Seems to do what you wanted...
答案2
得分: 0
以下将捕获来自所提供示例的值。
注意,要求“delete”,“from”或“join”是行上第一个非空白字符序列。
(?is)^\s*(?:delete\s+)?(?:from|join)\s*.*?([^\s]+)
(?is) 将启用_忽略大小写_ 和 单行 模式。
输出
xyz.test
sample.test
英文:
The following will capture the values from both of the provided examples.
As a note, a requirement is that the "delete", "from", or "join" are the first non-white-space character sequences on the line.
(?is)^\s*(?:delete\s+)?(?:from|join)\s*.*?([^\s]+)
The (?is) will toggle-on the ignore case, and single-line modes.
Output
xyz.test
sample.test
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论