使用Java的split方法来获取SQL语句中的表名

huangapple go评论71阅读模式
英文:

Using Java split to get table names in an SQL statement

问题

以下是翻译好的部分:

在以下代码中,我需要使用SQL保留字作为分隔符来获取表名作为标记。这些标记应包含表名或表名后跟一个后缀。

例如,给定以下内容:

table1 t1 inner join table2 t2 outer join table3

该代码应返回三个标记:

标记 1:table1 t1
标记 2:table2 t2
标记 3:table3

而这段代码实际上使用第一个保留字作为标记,而不会丢弃任何后续的保留字:

String str = "table1 t1 inner join table2 t2 outer join table3";
String[] tokens = sql2.split("\\son\\s|\\sinner\\s|\\souter\\s|\\sjoin\\s");

for (int i = 0; i < tokens.length; i++)
    System.out.println("Token " + (i + 1) + ":" + tokens[i]);

这将返回:

标记 1:table1 t1
标记 2:join table2 t2
标记 3:join table3

问题是什么,如何解决这个问题?

英文:

In the following code I need to get table names as tokens using SQL reserved words as separators. The tokens should contain the table name or the table name followed by a suffix.

For example, given

table1 t1 inner join table2 t2 outer join table3

The code should return three tokens:

Token 1: table1 t1
Token 2: table2 t2 
Token 3: table3

This code instead uses the first reserved word as token, without discarding any other following reserved words:

	String str = &quot;table1 t1 inner join table2 t2 outer join table3&quot;;
	String [] tokens = sql2.split(&quot;\\son\\s|\\sinner\\s|\\souter\\s|\\sjoin\\s&quot;);
	
	for (int i = 0; i&lt;tokens.length; i++)
         System.out.println(&quot;Token &quot;+(i+1)+&quot;:&quot;+tokens[i]);

This returns:

Token 1:table1 t1
Token 2:join table2 t2 
Token 3:join table3

What is the problem and how to make this work?

答案1

得分: 2

这适用于一系列连接表的通用情况:

String[] tableNames = str.split(" (?=inner|outer|left|join|right|cross|full).*?join ");
英文:

This works for the general case for a series of joined tables:

String[] tableNames = str.split(&quot; (?=inner|outer|left|join|right|cross|full).*?join &quot;);

答案2

得分: 1

以下是已翻译的内容:

你可以使用 (?:\w+\s+){1,2}(?=inner|outer join)|(?<=inner join)(?:\s+\w+){1,2}|(?<=outer join)(?:\s+\w+){1,2} 作为 正则表达式

演示:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // Test strings
        String[] arr = { "table1 t1 inner join table2 t2 outer join table3",
                        "table1 t1 inner join table2 t2 outer join table3 t3 inner join table2" };
        Pattern pattern = Pattern.compile(
                "(?:\\w+\\s+){1,2}(?=inner|outer join)|(?<=inner join)(?:\\s+\\w+){1,2}|(?<=outer join)(?:\\s+\\w+){1,2}");
        for (String s : arr) {
            System.out.println("Processing: " + s);
            Matcher matcher = pattern.matcher(s);
            while (matcher.find()) {
                System.out.println(matcher.group().trim());
            }
        }
    }
}

输出:

Processing: table1 t1 inner join table2 t2 outer join table3
table1 t1
table2 t2
table3
Processing: table1 t1 inner join table2 t2 outer join table3 t3 inner join table2
table1 t1
table2 t2
table3 t3
table2
英文:

You can use (?:\w+\s+){1,2}(?=inner|outer join)|(?&lt;=inner join)(?:\s+\w+){1,2}|(?&lt;=outer join)(?:\s+\w+){1,2} as the regex.

Demo:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
	public static void main(String[] args) {
		// Test strings
		String[] arr = { &quot;table1 t1 inner join table2 t2 outer join table3&quot;,
				&quot;table1 t1 inner join table2 t2 outer join table3 t3 inner join table2&quot; };
		Pattern pattern = Pattern.compile(
				&quot;(?:\\w+\\s+){1,2}(?=inner|outer join)|(?&lt;=inner join)(?:\\s+\\w+){1,2}|(?&lt;=outer join)(?:\\s+\\w+){1,2}&quot;);
		for (String s : arr) {
			System.out.println(&quot;Processing: &quot; + s);
			Matcher matcher = pattern.matcher(s);
			while (matcher.find()) {
				System.out.println(matcher.group().trim());
			}
		}
	}
}

Output:

Processing: table1 t1 inner join table2 t2 outer join table3
table1 t1
table2 t2
table3
Processing: table1 t1 inner join table2 t2 outer join table3 t3 inner join table2
table1 t1
table2 t2
table3 t3
table2

答案3

得分: 0

I've translated the content you provided:

我已经测试了您的模式(未转义,它是\son\s|\sinner\s|\souter\s|\sjoin\s)与此测试字符串:table1 t1 inner join table2 t2 outer join table3regex101.com上,我唯一得到的匹配是innerouter。因此,由于您正在使用这些标记拆分字符串,您获得了您的结果。

也许这可以帮助您解决特定情况。我选择了正则表达式方法,而不是拆分数据的方法。

public class PatternChecker {

    public static void main(String[] args) {
        String str = "table1 t1 inner join table2 t2 outer join table3";
        Pattern p = Pattern.compile("(table[0-9]+( [a-zA-Z0-9]+ )?)");
        Matcher m = p.matcher(str);

        while (m.find()) {
            System.out.println(m.group(0));
        }
    }

}

稍后编辑

拆分模式\\son\\s|\\sinner\\s|\\souter\\s|\\sjoin\\s由于使用了必需的空格而无法工作。
例如,您正在搜索*on**inner**outer**join*(空格用星号标记)。空格是您正在拆分的关键字的一部分。无法匹配*join*,因为其左侧空格已被*outer**inner*右侧空格匹配中的一个捕获。

回到拆分解决方案,一个修复方法是通过?定量器将join的左侧空格标记为可选;这将是新模式:\\son\\s|\\sinner\\s|\\souter\\s|\\s?join\\s。这会生成一些可以过滤掉的空令牌。

另一个想法是将使用join的聚合(即inner join,outer join)视为完整的搜索条件,这将导致\\son\\s|\\sinner join\\s|\\souter join\\s。不会生成空令牌。

public class PatternChecker {

    public static void main(String[] args) {
        String str = "employee t1 inner join department t2 outer join job join table4 history on a=b";

        String[] tokens = str.split("\\son\\s|\\sinner join\\s|\\souter join\\s|\\sjoin\\s");

        for (String token : tokens) {
            System.out.println(token);
        }

        // 输出
        // employee t1
        // department t2
        // job
        // table4 history
        // a=b

    }

}

请注意,由于您还包括on,您可以过滤掉包含等号符号的所有匹配令牌。

对于通用修复,您需要隔离在fromwhere之间的字符串,并应用上述思想。

英文:

I've tested your pattern (unescaped, it's \son\s|\sinner\s|\souter\s|\sjoin\s) against this test string: table1 t1 inner join table2 t2 outer join table3 on regex101.com and the only match I got is for inner and outer. So since you're splitting the string by these tokens, you get your result.

Perhaps this can help you for your specific case. I have went for a regex approach, instead of splitting the data.

public class PatternChecker {

    public static void main(String[] args) {
        String str = &quot;table1 t1 inner join table2 t2 outer join table3&quot;;
        Pattern p = Pattern.compile(&quot;(table[0-9]+( [a-zA-Z0-9]+ )?)&quot;);
        Matcher m = p.matcher(str);

        while(m.find()) {
            System.out.println(m.group(0));
        }
    }

}

Later edit

The split pattern \\son\\s|\\sinner\\s|\\souter\\s|\\sjoin\\s did not work because of the mandatory whitespaces used.
For instance, you are searching for *on* or *inner* or *outer* or *join* (whitespaces are marked with an asterisk). The whitespaces are part of the keywords you're splitting with. *join* could not be matched since its left-side whitespace was already picked up by the *outer* and *inner* right-side whitespace matches.

Going back to the split solution, one fix would be to mark the left-side whitespace of join as optional via the ? quantifier; this would be the new pattern: \\son\\s|\\sinner\\s|\\souter\\s|\\s?join\\s. This yields some empty tokens that can be filtered out

Another idea would be to consider aggregations using join (i.e. inner join, outer join) as full search criteria, which would lead to \\son\\s|\\sinner join\\s|\\souter join\\s. No empty tokens are generated.

public class PatternChecker {

    public static void main(String[] args) {
        String str = &quot;employee t1 inner join department t2 outer join job join table4 history on a=b&quot;;

        String[] tokens = str.split(&quot;\\son\\s|\\sinner join\\s|\\souter join\\s|\\sjoin\\s&quot;);

        for(String token : tokens) {
            System.out.println(token);
        }

        // Output
        // employee t1
        // department t2
        // job
        // table4 history
        // a=b

    }

}

Note that, since you're also including on, you can filter out all the matched tokens containing the equals symbol.

For a generic fix, you would need to isolate the string contained between from and where and apply the idea above.


答案4

得分: 0

我找到了这个与您的问题本质相同的问题。

您可以通过经历一个正则表达式的噩梦来解决这个问题,或者您可以尝试使用外部库,如Zql,它实际上会为您解析SQL语句。

英文:

I found this question that is essentially the same as yours.

You can mostly solve this problem by going through a regex nightmare, or you can try to use an external library like Zql which actually parses the SQL statement for you.

huangapple
  • 本文由 发表于 2020年8月1日 04:52:27
  • 转载请务必保留本文链接:https://go.coder-hub.com/63198950.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定