正则表达式来查找源和目标

huangapple go评论117阅读模式
英文:

Regular expression to find source and target

问题

需要帮忙编写通用正则表达式以查找以下情景中的源表:

示例1输出:

  1. xyz.test

示例2输出:无输出,因为sample.Test是目标表而不是源表。

我尝试过下面的正则表达式 \b(?!DELETE\s+FROM\s+)(?:FROM|JOIN)\s+(\w+./*[^\s]+),它对第一个情景起作用,但对第二个情景预期没有输出,然而我得到了sample.test。

英文:

Need help to write generic regular expression for to find the source table from the below scenarios:

Sample 1:

  1. INSERT INTO a.test
  2. (
  3. test1,
  4. test2
  5. )
  6. select
  7. test1,
  8. test2
  9. from
  10. xyz.test

Output 1:

  1. xyz.test

Sample 2:

  1. *******************************************************
  2. MACRO
  3. abc.test
  4. PURPOSE
  5. TO DELETE THE DATA FROM abc.test
  6. INPUT PARAMS
  7. NONE
  8. CALLED BY
  9. THIS MACRO IS BEING CALLED FROM sql
  10. *******************************************************/
  11. DELETE FROM sample.test

No output for this as sample.Test is target table not source.

I have tried below regex \b(?!DELETE\s+FROM\s+)(?:FROM|JOIN)\s+(\w+./*[^\s]+) it's working for 1st scenario but for 2nd scenario I was expecting no output however I am getting sample.test

答案1

得分: 2

I really don't think that a regular expression is the safe way to go.
There are really too many things to handle, such as comments, unions, etc.

I had a play with the
SQLGlot library with this code:

  1. ##python3 -m pip install sqlglot
  2. from sqlglot import parse_one, exp
  3. queries = [
  1. INSERT INTO a.test
  2. (
  3. test1,
  4. test2
  5. )
  6. select
  7. test1,
  8. test2
  9. from
  10. xyz.test
  11. Source tables:
  12. ['xyz.test']
  13. /*******************************************************
  14. MACRO
  15. abc.test
  16. PURPOSE
  17. TO DELETE THE DATA FROM abc.test
  18. INPUT PARAMS
  19. NONE
  20. CALLED BY
  21. THIS MACRO IS BEING CALLED FROM sql
  22. *******************************************************/
  23. DELETE FROM sample.test
  24. Source tables:
  25. []
  26. SELECT
  27. t1.firstname,
  28. t1.lastname,
  29. t2.email,
  30. t3.order
  31. FROM abc.table1 t1
  32. LEFT JOIN abc.table2 t2 ON t2.id = t1.id
  33. LEFT JOIN abc.table3 t3 ON t3.email = t2.email
  34. Source tables:
  35. ['abc.table1 AS t1', 'abc.table2 AS t2', 'abc.table3 AS t3']
  36. SELECT id, name FROM table1
  37. UNION
  38. SELECT id, name FROM table2
  39. Source tables:
  40. ['table1', 'table2']
英文:

I really don't think that a regular expression is the safe way to go.
There are really too many things to handle, such as comments, unions, etc.

I had a play with the
SQLGlot library with this code:

  1. ##python3 -m pip install sqlglot
  2. from sqlglot import parse_one, exp
  3. queries = [
  4. """
  5. INSERT INTO a.test
  6. (
  7. test1,
  8. test2
  9. )
  10. select
  11. test1,
  12. test2
  13. from
  14. xyz.test
  15. """,
  16. """
  17. /*******************************************************
  18. MACRO
  19. abc.test
  20. PURPOSE
  21. TO DELETE THE DATA FROM abc.test
  22. INPUT PARAMS
  23. NONE
  24. CALLED BY
  25. THIS MACRO IS BEING CALLED FROM sql
  26. *******************************************************/
  27. DELETE FROM sample.test
  28. """,
  29. """
  30. SELECT
  31. t1.firstname,
  32. t1.lastname,
  33. t2.email,
  34. t3.order
  35. FROM abc.table1 t1
  36. LEFT JOIN abc.table2 t2 ON t2.id = t1.id
  37. LEFT JOIN abc.table3 t3 ON t3.email = t2.email
  38. """,
  39. """
  40. SELECT id, name FROM table1
  41. UNION
  42. SELECT id, name FROM table2
  43. """
  44. ]
  45. for query in queries:
  46. print(query)
  47. sourceTables = []
  48. for select in parse_one(query).find_all(exp.Select):
  49. for table in select.find_all(exp.Table):
  50. sourceTables.append(str(table))
  51. print("Source tables:")
  52. print(sourceTables)
  53. print("\n")

It outputs the following:

  1. INSERT INTO a.test
  2. (
  3. test1,
  4. test2
  5. )
  6. select
  7. test1,
  8. test2
  9. from
  10. xyz.test
  11. Source tables:
  12. ['xyz.test']
  13. /*******************************************************
  14. MACRO
  15. abc.test
  16. PURPOSE
  17. TO DELETE THE DATA FROM abc.test
  18. INPUT PARAMS
  19. NONE
  20. CALLED BY
  21. THIS MACRO IS BEING CALLED FROM sql
  22. *******************************************************/
  23. DELETE FROM sample.test
  24. Source tables:
  25. []
  26. SELECT
  27. t1.firstname,
  28. t1.lastname,
  29. t2.email,
  30. t3.order
  31. FROM abc.table1 t1
  32. LEFT JOIN abc.table2 t2 ON t2.id = t1.id
  33. LEFT JOIN abc.table3 t3 ON t3.email = t2.email
  34. Source tables:
  35. ['abc.table1 AS t1', 'abc.table2 AS t2', 'abc.table3 AS t3']
  36. SELECT id, name FROM table1
  37. UNION
  38. SELECT id, name FROM table2
  39. Source tables:
  40. ['table1', 'table2']

Seems to do what you wanted...

答案2

得分: 0

以下将捕获来自所提供示例的值。

注意,要求“delete”,“from”或“join”是行上第一个非空白字符序列。

  1. (?is)^\s*(?:delete\s+)?(?:from|join)\s*.*?([^\s]+)

(?is) 将启用_忽略大小写_ 和 单行 模式。

输出

  1. xyz.test
  2. sample.test
英文:

The following will capture the values from both of the provided examples.

As a note, a requirement is that the "delete", "from", or "join" are the first non-white-space character sequences on the line.

  1. (?is)^\s*(?:delete\s+)?(?:from|join)\s*.*?([^\s]+)

The (?is) will toggle-on the ignore case, and single-line modes.

Output

  1. xyz.test
  2. sample.test

huangapple
  • 本文由 发表于 2023年6月1日 19:01:07
  • 转载请务必保留本文链接:https://go.coder-hub.com/76381207.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定