Displaying nested HashMaps with a TreeSet to show the index of each word in a given file

Question

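I want to display (in JSON format) a list of the words contained in a file, where the first key is the word, the second key is the file it comes from, and the values are the positions at which the word was found in that file. The only problem is that something seems to be wrong with the new TreeSet<Integer>, because all my words end up with the same HashMap<String, TreeSet<Integer>>. They all share the same nested HashMap, but I want each one to be separate and independent. I would appreciate a little help. Here is my code: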

public static HashMap<String, HashMap<String, TreeSet<Integer>>> listStems(Path inputFile) throws IOException {
    HashMap<String, HashMap<String, TreeSet<Integer>>> finalString = new HashMap<String, HashMap<String, TreeSet<Integer>>>();
    HashMap<String, TreeSet<Integer>> mapString = new HashMap<String, TreeSet<Integer>>();
    int counter = 0;
    Stemmer stemmer = new SnowballStemmer(DEFAULT);
    try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(inputFile.toString()), "UTF-8"));) {
        String line;
        while ((line = br.readLine()) != null) {
            String[] toStemArray = parse(line);

            for (int i = 0; i < toStemArray.length; i++) {
                counter++;
                if (!finalString.containsKey(toStemArray[i])) {
                    mapString.put(inputFile.toString(), new TreeSet<Integer>());
                    finalString.put(toStemArray[i], mapString);
                    finalString.get(toStemArray[i]).get(inputFile.toString()).add(counter);
                } else if (finalString.containsKey(toStemArray[i])) {
                    finalString.get(toStemArray[i]).get(inputFile.toString()).add(counter);
                }
            }
        }
    }
    return finalString;
}

Answer 1

Score: 1

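All of your HashMap<String, TreeSet<Integer>> instances are the same because you create only a single instance (mapString) at the start of your method and then reuse it for every word.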
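The sharing can be seen in isolation with a small sketch. The class name SharedMapDemo, the helper buildShared, and the sample words and file name are made up for illustration and are not part of the original code; the stemmer and file reading are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical demo class: reproduces the sharing problem with two hard-coded words.
class SharedMapDemo {

    static Map<String, Map<String, TreeSet<Integer>>> buildShared() {
        Map<String, Map<String, TreeSet<Integer>>> outer = new HashMap<>();
        // Created only once, like mapString in the question.
        Map<String, TreeSet<Integer>> shared = new HashMap<>();

        // First new word: store the shared map under "hello".
        shared.put("a.txt", new TreeSet<>());
        outer.put("hello", shared);
        outer.get("hello").get("a.txt").add(1);

        // Second new word: the put replaces the TreeSet inside the same shared map,
        // so the position recorded for "hello" is lost as well.
        shared.put("a.txt", new TreeSet<>());
        outer.put("world", shared);
        outer.get("world").get("a.txt").add(2);

        return outer;
    }

    public static void main(String[] args) {
        Map<String, Map<String, TreeSet<Integer>>> outer = buildShared();
        System.out.println(outer.get("hello") == outer.get("world")); // true
        System.out.println(outer.get("hello"));                       // {a.txt=[2]}
    }
}
```

Both keys point at one and the same inner map object, so any change made through one word is visible through every other word.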

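In your inner if statement, you check whether you have seen the word before, and if you have not, you add an entry to your one and only HashMap<String, TreeSet<Integer>> that maps the file name to a new, empty TreeSet<Integer>. That is almost the right pattern: you are detecting a new word and creating a new TreeSet, but you are not creating a new HashMap<String, TreeSet<Integer>>.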

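If you want one HashMap<String, TreeSet<Integer>> per word, you need to create a new one every time you see a new word instead of just once. Move the creation of new HashMap<String, TreeSet<Integer>>() to immediately before the mapString.put line and you will almost have it working: you will have one HashMap<String, TreeSet<Integer>> per word, but at that point you are still only creating a single TreeSet for each word, at the moment the word is first seen.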

Fix that the same way (by making a new TreeSet if you have not seen that file for that word before) and you should be good!
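One way to put both fixes together is sketched below. It is not the asker's final code: the class name WordIndexSketch and the method indexWords are hypothetical, the words are passed in as an array instead of being read and stemmed from a file, and Map.computeIfAbsent is used in place of the explicit containsKey checks described above. It creates the inner map or the set only when the key is missing, which has the same effect.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: one inner map per word, one TreeSet per (word, file) pair.
class WordIndexSketch {

    static Map<String, Map<String, TreeSet<Integer>>> indexWords(String[] words, String fileName) {
        Map<String, Map<String, TreeSet<Integer>>> index = new HashMap<>();
        int position = 0;
        for (String word : words) {
            position++;
            index
                // New inner map only the first time this word is seen.
                .computeIfAbsent(word, w -> new HashMap<>())
                // New TreeSet only the first time this file is seen for this word.
                .computeIfAbsent(fileName, f -> new TreeSet<>())
                .add(position);
        }
        return index;
    }

    public static void main(String[] args) {
        String[] words = {"the", "cat", "the"};
        Map<String, Map<String, TreeSet<Integer>>> index = indexWords(words, "a.txt");
        System.out.println(index.get("the")); // {a.txt=[1, 3]}
        System.out.println(index.get("cat")); // {a.txt=[2]}
    }
}
```

Because every word now gets its own inner map, adding a position for one word no longer affects the others, and calling the same logic for a second file would add a second key to each word's inner map rather than overwriting the first.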


huangapple
  • Posted on September 25, 2020 at 04:49:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64054203.html