Regex captures empty strings along with the expected groups; how do I stop it from capturing empty strings?
Question
I built a regex that finds a key in JSON text and captures its value. Along with the expected values, however, it also captures empty strings.
Regex:
(?<=((?i)(finInstKey)":)["]?)(.*?)(?=["|,|}])|(?<="((?i)finInstKey","value":)["]?)(.*?)(?=["|,|}])
input:
- {"finInstKey":500},{"name":"finInstKey","value":12345678900987654321}
- {finInstKey":"500"},{"name":"finInstKey","value":"12345678900987654321"}
For these inputs, the second one also produces empty strings alongside the expected values.
actual output:
500
12345678900987654321
<empty string>
500
<empty string>
12345678900987654321
expected output:
500
12345678900987654321
500
12345678900987654321
For now I handle this manually in the Java code, but it would be nicer if the regex did not capture the empty strings.
What changes should I make to the regex to get the expected output?
Mainly, I want to use replaceAll to replace the matched values with the mask "****".
My code:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexTester {
        // %s is replaced twice with the field name via String.format
        private static final String regex = "(?<=((?i)(%s)\":)[\"]?)(.*?)(?=[\"|,|}])|(?<=\"((?i)%s\",\"value\":)[\"]?)(.*?)(?=[\"|,|}])";

        public static void main(String[] args) {
            String field = "finInstKey";
            String input = "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}{finInstKey\":\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
            try {
                Pattern pattern = Pattern.compile(String.format(regex, field, field));
                Matcher matcher = pattern.matcher(input);
                // System.out.println(matcher.replaceAll("****"));
                while (matcher.find()) {
                    System.out.println(matcher.group());
                }
            } catch (Exception e) {
                System.err.println(e);
            }
        }
    }
Answer 1
Score: 3
It would probably be easier to parse the JSON with a JSON parsing library instead of a regex.
Try the fromJson method from https://github.com/google/gson
If you insist on using a regex, look into the + quantifier, which means "match one or more". A regex becomes pretty difficult to read once it gets as complicated as the one you have there.
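As a rough sketch of the "one or more" suggestion, and not a drop-in replacement for the asker's code: the pattern below keeps the lookbehind idea from the question but makes the match itself [^",}]+, which cannot be empty. The class name MaskWithLookbehind and the helper mask are made up for illustration, and the sketch assumes that values never contain a quote, comma, or closing brace.

```java
import java.util.regex.Pattern;

class MaskWithLookbehind {
    // Hypothetical helper: masks every value that belongs to `field`.
    static String mask(String input, String field) {
        String f = Pattern.quote(field); // treat the field name literally
        // The lookbehind accepts either  field":  or  "field","value":  followed
        // by an optional opening quote. The match is one or more characters that
        // are not a delimiter, so an empty match is impossible.
        Pattern p = Pattern.compile(
            "(?i)(?<=" + f + "\":\"?|\"" + f + "\",\"value\":\"?)[^\",}]+");
        return p.matcher(input).replaceAll("****");
    }

    public static void main(String[] args) {
        String input = "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}"
            + "{finInstKey\":\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
        // prints {"finInstKey":****},{"name":"finInstKey","value":****}{finInstKey":"****"},{"name":"finInstKey","value":"****"}
        System.out.println(mask(input, "finInstKey"));
    }
}
```

Because the opening quote is consumed by the lookbehind rather than by the match, the surrounding quotes of string values survive the replacement.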
Answer 2
Score: 0
In the second input the finInstKey key is not fully enclosed in quotes. Treating the quotes around "finInstKey" as optional in the pattern lets it match this input as well and extract the value correctly.
Use it like this:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Main {
        public static void main(String[] args) {
            String field = "finInstKey";
            String regex = "\"?" + field + "\"?(\\s*:\\s*\"?([^\",}]*)\"?|\",\"value\"\\s*:\\s*\"?([^\",}]*)\"?)";
            String input = "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}{finInstKey:\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
            Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(input);
            while (matcher.find()) {
                if (matcher.group(2) != null) {
                    System.out.println(matcher.group(2));
                } else {
                    System.out.println(matcher.group(3));
                }
            }
        }
    }
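Since the question's real goal is masking with replaceAll, here is one possible adaptation of the same idea, offered as a sketch rather than as this answer's own code: the key part is captured as group 1 and written back with $1, so only the value is replaced. The class name MaskWithPrefixGroup and the helper mask are hypothetical, and values are again assumed to contain no quote, comma, or closing brace.

```java
import java.util.regex.Pattern;

class MaskWithPrefixGroup {
    // Hypothetical helper: keeps the key prefix and masks only the value.
    static String mask(String input, String field) {
        String f = Pattern.quote(field); // treat the field name literally
        // Group 1 is either  "field":  (quotes optional) or  "field","value":
        // plus an optional opening quote. The value itself follows the group
        // and must contain at least one non-delimiter character.
        Pattern p = Pattern.compile(
            "(\"?" + f + "\"?\\s*:\\s*\"?|\"" + f + "\"\\s*,\\s*\"value\"\\s*:\\s*\"?)[^\",}]+",
            Pattern.CASE_INSENSITIVE);
        // $1 re-inserts the captured prefix in front of the mask.
        return p.matcher(input).replaceAll("$1****");
    }

    public static void main(String[] args) {
        String input = "{\"finInstKey\":500},{\"name\":\"finInstKey\",\"value\":12345678900987654321}"
            + "{finInstKey:\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
        // prints {"finInstKey":****},{"name":"finInstKey","value":****}{finInstKey:"****"},{"name":"finInstKey","value":"****"}
        System.out.println(mask(input, "finInstKey"));
    }
}
```

This variant avoids lookbehind entirely, which tends to make the pattern easier to read, at the cost of needing the $1 back-reference in the replacement string.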
Answer 3
Score: 0
You can use the following pattern. The capture groups of interest are 2 and 3.
It's not easy to determine where a value ends, since a text value may contain any of the possible delimiters.
Make sure your data conforms; the pattern assumes each value is just a series of digits.
    (?si)(\"finInstKey\")\s*:\s*\"?(.+?)\b.+?\"name\"\s*:\s*\1\s*,\s*\"value\"\s*:\s*\"?(.+?)\b
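For illustration, the pattern could be applied roughly as follows. The class name PairPatternDemo and the helper extract are invented for this sketch; it reads groups 2 and 3 from the first match and returns null when nothing matches, which is what happens when the key lacks its opening quotation mark.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairPatternDemo {
    // The pattern from above, written as a Java string literal.
    static final Pattern PAIR = Pattern.compile(
        "(?si)(\"finInstKey\")\\s*:\\s*\"?(.+?)\\b.+?\"name\"\\s*:\\s*\\1"
        + "\\s*,\\s*\"value\"\\s*:\\s*\"?(.+?)\\b");

    // Hypothetical helper: returns {group 2, group 3}, or null if no match.
    static String[] extract(String json) {
        Matcher m = PAIR.matcher(json);
        return m.find() ? new String[] { m.group(2), m.group(3) } : null;
    }

    public static void main(String[] args) {
        String input = "{\"finInstKey\":\"500\"},{\"name\":\"finInstKey\",\"value\":\"12345678900987654321\"}";
        String[] groups = extract(input);
        System.out.println(groups[0]); // prints 500
        System.out.println(groups[1]); // prints 12345678900987654321
    }
}
```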
That said, I recommend just using a JSON parsing library; Gson by Google works well.
Your JSON strings are actually arrays, so just wrap each one in square brackets.
    [
      {
        "finInstKey": 500
      },
      {
        "name": "finInstKey",
        "value": 12345678900987654321
      }
    ]
Note that your second example has a missing quotation mark for the finInstKey key.
    [
      {
        "finInstKey": "500"
      },
      {
        "name": "finInstKey",
        "value": "12345678900987654321"
      }
    ]
With Gson you can utilize the JsonParser class to parse the values.
    import com.google.gson.JsonArray;
    import com.google.gson.JsonElement;
    import com.google.gson.JsonObject;
    import com.google.gson.JsonParser;
    import java.math.BigInteger;

    // The statements below belong inside a method such as main.
    String stringA = "[\n" +
        "  {\n" +
        "    \"finInstKey\": 500\n" +
        "  },\n" +
        "  {\n" +
        "    \"name\": \"finInstKey\",\n" +
        "    \"value\": 12345678900987654321\n" +
        "  }\n" +
        "]";
    String stringB = "[\n" +
        "  {\n" +
        "    \"finInstKey\": \"500\"\n" +
        "  },\n" +
        "  {\n" +
        "    \"name\": \"finInstKey\",\n" +
        "    \"value\": \"12345678900987654321\"\n" +
        "  }\n" +
        "]";

    // First input: numeric values. The second value is too large for a long,
    // so it is read as a BigInteger.
    JsonArray arrayA = JsonParser.parseString(stringA).getAsJsonArray();
    JsonObject objectA1 = arrayA.get(0).getAsJsonObject();
    JsonElement elementA1 = objectA1.get("finInstKey");
    int finInstKeyA = elementA1.getAsInt();
    JsonObject objectA2 = arrayA.get(1).getAsJsonObject();
    JsonElement elementA2 = objectA2.get("value");
    BigInteger valueA = elementA2.getAsBigInteger();
    System.out.println("finInstKeyA = " + finInstKeyA);
    System.out.println("valueA = " + valueA);

    // Second input: the same values stored as JSON strings.
    JsonArray arrayB = JsonParser.parseString(stringB).getAsJsonArray();
    JsonObject objectB1 = arrayB.get(0).getAsJsonObject();
    JsonElement elementB1 = objectB1.get("finInstKey");
    String finInstKeyB = elementB1.getAsString();
    JsonObject objectB2 = arrayB.get(1).getAsJsonObject();
    JsonElement elementB2 = objectB2.get("value");
    String valueB = elementB2.getAsString();
    System.out.println("finInstKeyB = " + finInstKeyB);
    System.out.println("valueB = " + valueB);
Output
finInstKeyA = 500
valueA = 12345678900987654321
finInstKeyB = 500
valueB = 12345678900987654321
Answer 4
Score: 0
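I think your regular expression is not correct. Try this instead:
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Optional;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public static List<String> getData(String str, String field) {
        // Group 1 holds the value of  field:value ; group 2 holds the value of
        //  "name":"field","value":value . Quotes around the digits are optional.
        String regex = "(?:\"?" + field + "\"?:\"?(\\d+)\"?)|(?:\"name\":\""
                + field + "\",\"value\":\"?(\\d+)\"?)";
        Matcher matcher = Pattern.compile(regex).matcher(str);
        List<String> data = new ArrayList<>();
        while (matcher.find()) {
            // Only one of the two groups is set for any given match.
            data.add(Optional.ofNullable(matcher.group(1))
                             .orElseGet(() -> matcher.group(2)));
        }
        return data;
    }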
Output:
500
12345678900987654321
500
12345678900987654321
P.S. I think parsing JSON with a regex is a strategically bad idea. I recommend using a JSON parser (Jackson, Gson, ...).