Regex To Parse Strings To Map

Question

I have the following String that I want to parse into a Map of Maps that can contain both String and Integer values. Each entry has a key, which may consist of letters, spaces and/or special characters, and a value (another map) enclosed in curly braces "{" and "}":

    {A->B={A=0, C=2, B=3, D=“A”, M=0, H=7, key=A->B},
     B->C={A=0, C=2, B=3, D=“A”, M=0, H=7, key=B->C},
     D & E={A=0, C=2, B=4, D=“A”, M=0, H=7, key=D & E},
     FGH={A=0, C=2, B=3, D=“A”, M=0, H=7, key=FGH}}

I need a regex that will identify the key/values inside the curly braces so I can parse these into a map before storing the map in another outer map using a key from the inner map.

Here’s my code so far:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StringToMapOfMapsDemo {

    public static void main(String[] args) {

        // Sample input
        String str = "{A->B={A=0, C=2, B=3, D=“A”, M=0, H=7, key=A->B},"
                   + " B->C={A=0, C=2, B=3, D=“A”, M=0, H=7, key=B->C},"
                   + " D & E={A=0, C=2, B=4, D=“A”, M=0, H=7, key=D & E},"
                   + " FGH={A=0, C=2, B=3, D=“A”, M=0, H=7, key=FGH}}";
        // Split on braces and commas, which discards those separators
        List<String> stringList = Stream.of(str.split("\\s*[{},]\\s*")).map(String::trim).collect(Collectors.toList());
        System.out.println(stringList);
        Map<String, Object> outerMap = new HashMap<>();
        for (String keyValue : stringList) {
            System.out.println(keyValue);
            Map<String, String> innerMap = new HashMap<>();
            String[] keyValueParts = keyValue.split("=");
            System.out.println(keyValueParts);
            innerMap.put(keyValueParts[0], keyValueParts[1]);
            if (innerMap.containsKey("key")){
                String keyForOuterMap = innerMap.get("key");
                outerMap.put(keyForOuterMap, innerMap);
            }
        }
        System.out.println(outerMap);
    }
}

Answer 1

Score: 2

A regex can’t help you with the code as it stands, because you first call split(), which consumes the separators, and those separators are needed (via lookarounds) to match keys using this regex:

[^{}=,\s][^{}=,]+(?==\{)

See live demo.
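
As a rough illustration of how that regex behaves, here is a minimal sketch that applies it to the unsplit input with `java.util.regex`. The class name `OuterKeyRegexDemo` and the helper `outerKeys` are hypothetical names chosen for this example, and the curly quotes from the sample are written as `\u201C` and `\u201D` escapes so the source compiles regardless of file encoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OuterKeyRegexDemo {

    // The key regex from the answer: a run of characters without braces,
    // '=' or ',' that is immediately followed by "={" (checked by a lookahead).
    private static final Pattern OUTER_KEY =
            Pattern.compile("[^{}=,\\s][^{}=,]+(?==\\{)");

    // Hypothetical helper: collects every outer key found in the raw input.
    static List<String> outerKeys(String input) {
        List<String> keys = new ArrayList<>();
        Matcher m = OUTER_KEY.matcher(input);
        while (m.find()) {
            keys.add(m.group());
        }
        return keys;
    }

    public static void main(String[] args) {
        // \u201C and \u201D are the curly quotes used in the question's sample.
        String str = "{A->B={A=0, C=2, B=3, D=\u201CA\u201D, M=0, H=7, key=A->B},\n"
                   + " B->C={A=0, C=2, B=3, D=\u201CA\u201D, M=0, H=7, key=B->C},\n"
                   + " D & E={A=0, C=2, B=4, D=\u201CA\u201D, M=0, H=7, key=D & E},\n"
                   + " FGH={A=0, C=2, B=3, D=\u201CA\u201D, M=0, H=7, key=FGH}}";
        System.out.println(outerKeys(str)); // prints [A->B, B->C, D & E, FGH]
    }
}
```

Because the second character class uses `+`, the pattern needs at least two characters, so a one-character outer key would not be matched as written.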

Rather than reinvent the wheel, I would first convert the input to JSON by adding " in the appropriate places, then parse it to a Map<String, Object> using whatever library you want, which can all handle nested maps and keys/values with quotes in them (which you would need to escape).

Here's some code to do that (tested and works with sample input provided in question):

import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;

str = str.trim().replace("\"", "\\\""); // trim, escape quotes
str = str.replaceAll("([{,])\\s*(.*?)=", "$1\"$2\"="); // quote names
str = str.replaceAll("=\\s*(?!\\{)(.*?)([},])", "=\"$1\"$2"); // quote values
str = str.replace('=', ':'); // replace = with :
// str is now valid json

// parse, choosing the jackson library
ObjectMapper mapper = new ObjectMapper().enable(SerializationFeature.INDENT_OUTPUT); // with pretty option
// parse, deserialize to LinkedHashMap to preserve order
Map<String, Object> map = (HashMap<String, Object>) mapper.readValue(str, LinkedHashMap.class);
// print parsed map as correctly indented json (see INDENT_OUTPUT enabled above)
System.out.println(mapper.writeValueAsString(map));

If you insist on writing your own code, you’re going to need to write a proper parser that understands the grammar of your language. Regex may help with that, but you’ll also need complex logic that tokenizes the input and builds an abstract syntax tree (AST) from it. Instead, I would do it the easy way, as per above.
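
For input that stays as flat as the sample (exactly one level of nesting, and no braces, commas or `=` inside keys or values), something much smaller than a full parser may be enough. The following is only a sketch under those assumptions, not a general solution. The class name `NestedMapRegexSketch`, both patterns, and the rule that short all-digit values become `Integer` are illustrative choices that do not come from the question or the answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NestedMapRegexSketch {

    // One outer entry: a key, then "={", then a brace-free body, then "}".
    private static final Pattern OUTER =
            Pattern.compile("([^{}=,\\s][^{}=,]*)=\\{([^{}]*)\\}");

    // One inner pair: key, '=', value; a value runs up to the next ','.
    private static final Pattern INNER =
            Pattern.compile("([^=,\\s][^=,]*)=([^,]*)");

    static Map<String, Map<String, Object>> parse(String input) {
        Map<String, Map<String, Object>> outer = new LinkedHashMap<>();
        Matcher entry = OUTER.matcher(input);
        while (entry.find()) {
            Map<String, Object> inner = new LinkedHashMap<>();
            Matcher pair = INNER.matcher(entry.group(2));
            while (pair.find()) {
                String value = pair.group(2).trim();
                // Assumption: values of up to nine digits are integers,
                // everything else is kept as a String.
                Object parsed = value.matches("-?\\d{1,9}")
                        ? (Object) Integer.valueOf(value)
                        : value;
                inner.put(pair.group(1).trim(), parsed);
            }
            // The outer key text is used directly; in the sample it equals
            // the inner "key" entry.
            outer.put(entry.group(1).trim(), inner);
        }
        return outer;
    }

    public static void main(String[] args) {
        String str = "{A->B={A=0, C=2, B=3, D=\u201CA\u201D, M=0, H=7, key=A->B},\n"
                   + " FGH={A=0, C=2, B=3, D=\u201CA\u201D, M=0, H=7, key=FGH}}";
        System.out.println(parse(str));
    }
}
```

Any deeper nesting, or a value that itself contains a comma or brace, would break these patterns, which is where the JSON conversion or a real parser becomes the safer route.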

huangapple
  • Posted on October 4, 2020, 03:27:15
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64188176.html