September 25, 2020, 04:49:54

Displaying nested HashMaps with a TreeSet to show the index of a word in a given file

Question

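I want to display (in JSON format) a list of the words contained in a file, where the first key is the word, the second key is the file it comes from, and the values are the positions at which the word is found in that file. The only problem is that something seems to be wrong with the new TreeSet<Integer>, because all of my words end up with the same HashMap<String, TreeSet<Integer>>. They all share the same nested HashMap, but I want each word to have its own independent one. I would appreciate a little help. Here is my code: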

public static HashMap<String, HashMap<String, TreeSet<Integer>>> listStems(Path inputFile) throws IOException {
    HashMap<String, HashMap<String, TreeSet<Integer>>> finalString = new HashMap<String, HashMap<String, TreeSet<Integer>>>();
    HashMap<String, TreeSet<Integer>> mapString = new HashMap<String, TreeSet<Integer>>();
    int counter = 0;
    Stemmer stemmer = new SnowballStemmer(DEFAULT);
    try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(inputFile.toString()), "UTF-8"));) {
        String line;
        while ((line = br.readLine()) != null) {
            String[] toStemArray = parse(line);

            for (int i = 0; i < toStemArray.length; i++) {
                counter++;
                if (!finalString.containsKey(toStemArray[i])) {
                    mapString.put(inputFile.toString(), new TreeSet<Integer>());
                    finalString.put(toStemArray[i], mapString);
                    finalString.get(toStemArray[i]).get(inputFile.toString()).add(counter);
                } else if (finalString.containsKey(toStemArray[i])) {
                    finalString.get(toStemArray[i]).get(inputFile.toString()).add(counter);
                }
            }
        }
    }
    return finalString;
}

Answer 1

Score: 1

All of your HashMap<String, TreeSet<Integer>> instances are the same because you create only one instance (mapString) at the start of your method and then reuse it.
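To see why every word ends up with the same inner map, here is a minimal, self-contained sketch of the pattern. The class name SharedMapDemo, the method buildShared, and the file name a.txt are made up for illustration; the file reading and stemming from the question are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

class SharedMapDemo {
    // Mimics the question's pattern: one inner map is created once and reused for every word.
    static Map<String, Map<String, TreeSet<Integer>>> buildShared(String file, String... words) {
        Map<String, Map<String, TreeSet<Integer>>> index = new HashMap<>();
        Map<String, TreeSet<Integer>> shared = new HashMap<>(); // created only once
        int position = 0;
        for (String word : words) {
            position++;
            if (!index.containsKey(word)) {
                shared.put(file, new TreeSet<>()); // replaces the set stored for this file
                index.put(word, shared);           // every word points at the same map
            }
            index.get(word).get(file).add(position);
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Map<String, TreeSet<Integer>>> index = buildShared("a.txt", "apple", "banana");
        // Reference comparison: true means both words share one map object.
        System.out.println(index.get("apple") == index.get("banana"));
        System.out.println(index);
    }
}
```

Both words print the same inner map, and the == comparison confirms that they are one object. The position recorded for apple is gone because the second put replaced its TreeSet.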

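In your inner if statement, you check whether you have seen the word before, and if you have not, you add an entry to that single HashMap<String, TreeSet<Integer>> mapping the file name to a new, empty TreeSet<Integer>. Because put replaces the existing entry for the file name, each new word also discards the TreeSet that was stored before. That is almost the right pattern: you detect a new word and create a new TreeSet, but you do not create a new HashMap<String, TreeSet<Integer>>.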

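If you want one HashMap<String, TreeSet<Integer>> per word, you need to create a new one every time you see a new word instead of just once. Move your new HashMap<String, TreeSet<Integer>>() to immediately before the mapString.put line and you will almost have it working: you will have one HashMap<String, TreeSet<Integer>> per word, but you will still create a TreeSet only when a word is first seen.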

Fix that the same way (by creating a new TreeSet if you have not yet seen that file for that word) and you should be good!
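Putting both fixes together might look like the sketch below. It is an illustration under assumptions rather than a drop-in replacement: StemIndexSketch and addFile are hypothetical names, the lines are passed in as a list instead of being read from a Path, and a plain whitespace split stands in for the question's parse and stemming. Map.computeIfAbsent creates the inner HashMap and the TreeSet only when the key is missing.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class StemIndexSketch {
    // Records each word of the given lines as word -> file -> positions in that file.
    static void addFile(Map<String, Map<String, TreeSet<Integer>>> index,
                        String file, List<String> lines) {
        int position = 0;
        for (String line : lines) {
            // Assumption: whitespace splitting replaces the question's parse(line).
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // a blank line yields one empty token
                }
                position++;
                index.computeIfAbsent(word, w -> new HashMap<>()) // new inner map per new word
                     .computeIfAbsent(file, f -> new TreeSet<>()) // new set per new file for that word
                     .add(position);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, TreeSet<Integer>>> index = new HashMap<>();
        addFile(index, "a.txt", Arrays.asList("apple banana", "apple"));
        addFile(index, "b.txt", Arrays.asList("banana"));
        System.out.println(index);
    }
}
```

Here apple ends up with positions 1 and 3 for a.txt, while banana has its own map with one entry for each file.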

