How can you create a map or list of substring position indexes in Java?

Question

I am parsing many lines from a text file. The lines are fixed-width, but the layout of each line depends on how it begins, e.g. "0301....". Other lines begin with 11, 34, etc., and each prefix means the line is split differently.

Example: if the line starts with "03", it is split as follows:

name = line.substring(2, 10);
surname = line.substring(11, 21);
id = line.substring(22, 34);
address = line.substring(35, 46);

Another example: if the line starts with "24", it is split as follows:

name = line.substring(5, 15);
salary = line.substring(35, 51);
empid = line.substring(22, 34);
department = line.substring(35, 46);

So I have many substrings assigned to many strings, which are then written to a new CSV file.

My question: is there an easy way to store the coordinates (indexes) of each substring so that they are easier to use later? For example:

name = (2,10);
surname = (11,21);

...
etc.

Or is there an alternative to using substrings? Thank you!

Answer 1

Score: 1

Create a class called Line and store these objects rather than the string:

class Line {

  int[] name;
  int[] surname;
  int[] id;
  int[] address;

  String line;

  public Line(String line) {
    this.line = line;

    // The record-type prefix is two characters (e.g. "03"), so take indices 0..2
    String startCode = line.substring(0, 2);
    switch(startCode) {
      case "03":
        this.name = new int[]{2, 10};
        this.surname = new int[]{11, 21};
        this.id = new int[]{22, 34};
        this.address = new int[]{35, 46};
        break;
      case "24":
        // same thing with different indices
        break;
      // add more cases
    }
  }

  public String getName() {
    return this.line.substring(this.name[0], this.name[1]);
  }

  public String getSurname() {
    return this.line.substring(this.surname[0], this.surname[1]);
  }

  public String getId() {
    return this.line.substring(this.id[0], this.id[1]);
  }

  public String getAddress() {
    return this.line.substring(this.address[0], this.address[1]);
  }
}

Then:

String line = "03 ...";

Line parsed = new Line(line);
parsed.getName();
parsed.getSurname();
// ...

If you're going to retrieve the name, surname, etc. multiple times from the Line object, you can even cache each value the first time it is read so that you're not calling substring repeatedly.
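The caching idea could look roughly like the sketch below. It is only one possible shape, not the answer's own code: the class name CachedLine, the get(String) accessor, and the use of a map for offsets are assumptions made for illustration, and only the "03" layout from the question is filled in.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant of the Line class: offsets live in a map and each
// substring is computed at most once, then served from a cache.
class CachedLine {
    private final String line;
    private final Map<String, int[]> offsets = new HashMap<>();
    private final Map<String, String> cache = new HashMap<>();

    CachedLine(String line) {
        this.line = line;
        String startCode = line.substring(0, 2);
        switch (startCode) {
            case "03":
                offsets.put("name", new int[]{2, 10});
                offsets.put("surname", new int[]{11, 21});
                break;
            default:
                // Fail early instead of leaving the offsets empty
                throw new IllegalArgumentException("Unknown record type: " + startCode);
        }
    }

    // Returns the field value, calling substring only on the first request
    String get(String field) {
        return cache.computeIfAbsent(field, f -> {
            int[] range = offsets.get(f);
            if (range == null) {
                throw new IllegalArgumentException("Unknown field: " + f);
            }
            return line.substring(range[0], range[1]);
        });
    }

    public static void main(String[] args) {
        // Build a sample "03" line: 8-char name at 2..10, 10-char surname at 11..21
        String raw = "03" + String.format("%-8s", "John") + " " + String.format("%-10s", "Smith");
        CachedLine parsed = new CachedLine(raw);
        System.out.println(parsed.get("name").trim());
        System.out.println(parsed.get("surname").trim());
    }
}
```

Whether the cache is worth it depends on how often each field is read; for a single pass that writes every field once to CSV, plain substring calls are probably enough.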

Answer 2

Score: 1

You could try something like this. I'll leave the bounds checking and optimization to you, but as a first pass...

import java.util.HashMap;
import java.util.Map;

// Wrapper class added so the snippet compiles; the name is arbitrary
public class Main {

    public static void main(String[] args) {
        Map<String, Map<String, IndexDesignation>> substringMapping = new HashMap<>();

        // Put all the designations of how to map here
        substringMapping.put("03", new HashMap<>());
        substringMapping.get("03").put("name", new IndexDesignation(2, 10));
        substringMapping.get("03").put("surname", new IndexDesignation(11, 21));

        // This determines which mapping to use, based on the two-character prefix of the input line (args[0])
        Map<String, IndexDesignation> indexDesignationMap = substringMapping.get(args[0].substring(0, 2));

        // This holds the results
        Map<String, String> resultsMap = new HashMap<>();

        // Make sure we actually have a map to use
        if (indexDesignationMap != null) {
            // Now take this particular map designation and turn it into the resulting map of name to values
            for (Map.Entry<String, IndexDesignation> mapEntry : indexDesignationMap.entrySet()) {
                resultsMap.put(mapEntry.getKey(), args[0].substring(mapEntry.getValue().startIndex,
                        mapEntry.getValue().endIndex));
            }
        }

        // Print out the results (and you can assign to another object here as needed)
        System.out.println(resultsMap);
    }

    // Could also just use a list of two elements instead of this
    static class IndexDesignation {
        int startIndex;
        int endIndex;

        public IndexDesignation(int startIndex, int endIndex) {
            this.startIndex = startIndex;
            this.endIndex = endIndex;
        }
    }
}
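Since the question ends with writing a CSV file, here is a hedged sketch of how the bounds checking left open above might be filled in. The class FixedWidthToCsv and the method toCsv are hypothetical names, the offsets are the "03" layout from the question, and LinkedHashMap is used instead of HashMap only so that the columns come out in a predictable order. It does not quote or escape commas inside field values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: looks up a layout by record-type prefix and joins the fields as CSV.
class FixedWidthToCsv {
    // prefix -> (field name -> {start, end}); LinkedHashMap keeps insertion order
    static final Map<String, Map<String, int[]>> LAYOUTS = new LinkedHashMap<>();

    static {
        Map<String, int[]> type03 = new LinkedHashMap<>();
        type03.put("name", new int[]{2, 10});
        type03.put("surname", new int[]{11, 21});
        type03.put("id", new int[]{22, 34});
        type03.put("address", new int[]{35, 46});
        LAYOUTS.put("03", type03);
    }

    static String toCsv(String line) {
        if (line.length() < 2) {
            throw new IllegalArgumentException("Line too short to contain a record type");
        }
        Map<String, int[]> layout = LAYOUTS.get(line.substring(0, 2));
        if (layout == null) {
            throw new IllegalArgumentException("Unknown record type: " + line.substring(0, 2));
        }
        StringJoiner csv = new StringJoiner(",");
        for (int[] range : layout.values()) {
            // Bounds check: clamp both ends so a truncated line yields short or empty fields
            int start = Math.min(range[0], line.length());
            int end = Math.min(range[1], line.length());
            csv.add(line.substring(start, end).trim());
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        String line = "03" + String.format("%-8s", "John") + " "
                + String.format("%-10s", "Smith") + " "
                + String.format("%-12s", "123") + " "
                + String.format("%-11s", "Main St");
        System.out.println(toCsv(line));
    }
}
```

Clamping is one choice among several; rejecting lines that are shorter than the layout expects may be safer if truncated records indicate a corrupt file.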

Answer 3

Score: 0

We can also use regex patterns and streams to achieve the result.

Say we have a text file like this:

03SomeNameSomeSurname
24SomeName10000

Each regex pattern uses named groups to assign an attribute name to the parsed text. So, the pattern for the first line is:

^03(?<name>.{8})(?<surname>.{11})

The code is:

import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Wrapper class added so the snippet compiles; the name is arbitrary.
// Requires Java 9+ (List.of, Matcher.results).
public class Main {

    public static void main(String[] args) {
        // Fixed-width file lines
        List<String> fileLines = List.of(
                "03SomeNameSomeSurname",
                "24SomeName10000"
        );
        // List all regex patterns for the specific file
        List<Pattern> patternList = List.of(
                Pattern.compile("^03(?<name>.{8})(?<surname>.{11})"), // Regex for the line 03SomeNameSomeSurname
                Pattern.compile("^24(?<name>.{8})(?<salary>.{5})")); // Regex for the line 24SomeName10000

        // Pattern for finding group names inside a regex string
        Pattern groupNamePattern = Pattern.compile("\\?<([a-zA-Z0-9]*)>");

        List<List<String>> output = fileLines.stream().map(
                line -> patternList.stream() // Stream over the pattern list
                        .map(pattern -> pattern.matcher(line)) // Create a matcher for the fixed-width line and regex pattern
                        .filter(matcher -> matcher.find()) // Keep only the matchers that match the line
                        .map( // Transform matcher results into a String (group name = matched value)
                                matcher ->
                                        groupNamePattern.matcher(matcher.pattern().toString()).results() // Find group names in the regex pattern
                                                .map(groupNameMatchResult -> groupNameMatchResult.group(1) + "=" + matcher.group(groupNameMatchResult.group(1))) // group name = matched value
                                                .collect(Collectors.joining(","))) // Join results, delimited with ","
                        .collect(Collectors.toList())
        ).collect(Collectors.toList());

        System.out.println(output);
    }
}

The output contains each attribute name and its value, parsed into a list of strings.

[[name=SomeName,surname=SomeSurname], [name=SomeName,salary=10000]]
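The patterns above encode field widths ({8}, {11}) rather than the start and end indexes used in the question. As a rough sketch of how the two relate, the regex can be generated from an ordered map of field name to width, which also avoids re-parsing the pattern string to recover the group names. The names WidthPattern, build, and parse are hypothetical, and the sketch assumes the fields are contiguous with no gaps between them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: builds "^<prefix>(?<field>.{width})..." from ordered field widths.
class WidthPattern {

    static Pattern build(String prefix, Map<String, Integer> widths) {
        StringBuilder regex = new StringBuilder("^").append(Pattern.quote(prefix));
        for (Map.Entry<String, Integer> e : widths.entrySet()) {
            // Field names must be valid group names: a letter followed by letters or digits
            regex.append("(?<").append(e.getKey()).append(">.{").append(e.getValue()).append("})");
        }
        return Pattern.compile(regex.toString());
    }

    // Returns field name -> value in declaration order, or an empty map if the line does not match
    static Map<String, String> parse(String line, String prefix, Map<String, Integer> widths) {
        Matcher m = build(prefix, widths).matcher(line);
        Map<String, String> out = new LinkedHashMap<>();
        if (m.find()) {
            for (String name : widths.keySet()) {
                out.put(name, m.group(name));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> widths = new LinkedHashMap<>();
        widths.put("name", 8);
        widths.put("surname", 11);
        System.out.println(parse("03SomeNameSomeSurname", "03", widths));
    }
}
```

In real use the compiled Pattern would be built once per record type and reused, since compiling it for every line is wasteful.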

huangapple
  • Posted on August 31, 2020, 00:49:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63659831.html