Java wrong value when reading from website
Question
I'm trying to write a program that fetches a word from a website, which always returns a random (German) word, and counts how often each character occurs in that word. When I test the program with a stream built from a list, it works fine. When I read from the website, the word is printed correctly with System.out, but the letter counts are wrong. Here is my code:
    public class WordCount {
        public static String charStat(String urlString) throws IOException {
            /* List<String> list = new ArrayList<>();
            list.add("word");
            Stream<String> characterStream = list.stream();*/ // works totally fine every time
            URL url = new URL(urlString);
            Stream<String> characterStream = new BufferedReader(new InputStreamReader(url.openStream())).lines();
            BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()));
            // BufferedReader only used to print the word so I can check that everything is working
            System.out.println(br.readLine());
            int[] charNumber = new int[26]; // size is 26 because the alphabet has 26 characters
            Runnable func = () -> {
                characterStream
                    .map(String::toLowerCase)
                    .flatMapToInt(CharSequence::chars)
                    .filter(c -> c != ' ')
                    .map(c -> c - (int) 'a') // subtracting 'a' (97 in ASCII) so that a is at position 0 of the array
                    .forEach(i -> {charNumber[i]++;});
            };
            func.run();
            characterStream.close();
            return "a: " + charNumber[0]; // returns how many times the letter a is present, could be any letter
        }
        public static void main(String[] args) throws IOException { // I know that main shouldn't throw an exception
            // the website I'm getting the word from
            System.out.println(charStat("https://randomeword.azurewebsites.net/api/word"));
        }
    }
Example of a failure:
Word:
Klavierkonzert
The resulting array:
[1, 0, 0, 0, 3, 0, 2, 0, 1, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 1]
What it should be:
[1, 0, 0, 0, 2, 0, 0, 0, 1, 0, 2, 1, 0, 1, 1, 0, 0, 2, 0, 1, 0, 1, 0, 0, 0, 1]
I don't know why this happens, since the word is shown correctly by System.out.println(). If I've done something wrong, please let me know.
Answer 1
Score: 2
Calling URL.openStream() twice actually sends two separate requests to the website, so you receive two different random words: one is printed, the other is counted.
Call URL.openStream() only once and store the result in a variable. Use that variable for all your operations and you should get the desired outcome.
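The advice above can be sketched as follows. This is a minimal illustration, not the asker's final code: the class name SingleReadCount and the helpers fetchWord and countLetters are hypothetical, UTF-8 is assumed as the response encoding, and only the ASCII letters a-z are counted. The word is read from the URL once, stored in a String, and that same String is both printed and counted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical class name, chosen for this sketch only.
public class SingleReadCount {

    // Opens the URL exactly once and returns the first line of the response.
    // Assumption: the response is UTF-8 text with the word on its first line.
    static String fetchWord(String urlString) throws IOException {
        URL url = new URL(urlString);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line = reader.readLine();
            return line == null ? "" : line;
        }
    }

    // Counts the ASCII letters a-z; every other character is skipped,
    // so the index can never fall outside the 26-element array.
    static int[] countLetters(String word) {
        int[] counts = new int[26];
        word.toLowerCase(Locale.ROOT).chars()
            .filter(c -> c >= 'a' && c <= 'z')
            .forEach(c -> counts[c - 'a']++);
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // With a URL argument the word is fetched once; otherwise a fixed sample is used.
        String word = args.length > 0 ? fetchWord(args[0]) : "Klavierkonzert";
        System.out.println(word);                                // the word that was read
        System.out.println(Arrays.toString(countLetters(word))); // counts for that same word
    }
}
```

Run without arguments, it counts the sample word Klavierkonzert and prints the array the question lists as the expected result. Passing the website URL as the first argument fetches a fresh word instead.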
Answer 2
Score: 1
An additional answer, covering some further pitfalls:
- The German alphabet consists of more than the 26 ASCII letters (it also has ä, ö, ü and ß).
- German nouns start with an uppercase letter, which may itself be non-ASCII (like Ü).
- The site sends the text in some encoding, in this case UTF-8.
So:
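    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;
    import java.nio.charset.Charset;
    import java.text.Collator;
    import java.util.Locale;
    import java.util.Map;
    import java.util.TreeMap;
    import java.util.stream.Stream;

    // Wrapper class added so the snippet compiles; the name is arbitrary.
    public class WordFrequencies {
        public static void main(String[] args) throws IOException {
            String urlString = "https://randomeword.azurewebsites.net/api/word";
            URL url = new URL(urlString);
            URLConnection conn = url.openConnection();
            String contentType = conn.getContentType(); // e.g. "text/plain; charset=utf-8"; may be null
            String charsetName = contentType == null || !contentType.contains("charset=") ? "UTF-8"
                    : contentType.replaceFirst("^.*charset=([^;]*).*$", "$1");
            Charset charset = Charset.forName(charsetName); // Also checks validity.
            // Read from the same connection; url.openStream() would send a second request.
            try (Stream<String> lineStream = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), charset)).lines()) {
                lineStream.findFirst().ifPresent(word -> {
                    System.out.println("Word: " + word);
                    // Sort the keys by German collation rules rather than by code point.
                    Map<String, Integer> frequencies
                            = new TreeMap<>(Collator.getInstance(Locale.GERMANY));
                    word.codePoints()
                        .mapToObj(Character::toString) // Character.toString(int) needs Java 11+
                        .map(s -> s.toLowerCase(Locale.GERMANY))
                        .forEach(s -> frequencies.merge(s, 1, Integer::sum));
                    System.out.println("Frequencies: " + frequencies);
                });
            }
        }
    }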
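To see the first pitfall without touching the network, here is a small offline sketch. The class name GermanLetterFrequencies and the helper frequencies are hypothetical, and the sample word is Übergröße, written with Unicode escapes so that the source compiles under any file encoding. It assumes that a map keyed by letter is an acceptable replacement for the int[26] array from the question.

```java
import java.text.Collator;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name, chosen for this sketch only.
public class GermanLetterFrequencies {

    // Counts every code point of the lower-cased word, including non-ASCII letters.
    static Map<String, Integer> frequencies(String word) {
        // Keys are ordered by German collation rules.
        Map<String, Integer> result = new TreeMap<>(Collator.getInstance(Locale.GERMANY));
        word.toLowerCase(Locale.GERMANY).codePoints()
            .mapToObj(cp -> new String(Character.toChars(cp))) // also works before Java 11
            .forEach(letter -> result.merge(letter, 1, Integer::sum));
        return result;
    }

    public static void main(String[] args) {
        // The escaped literal spells the sample word with U-umlaut, o-umlaut and sharp s.
        String sample = "\u00dcbergr\u00f6\u00dfe";
        System.out.println(frequencies(sample));
        // Array index that the question's "c - 'a'" mapping would compute for u-umlaut.
        System.out.println('\u00fc' - 'a'); // prints 155
    }
}
```

The second line of output is 155, far beyond the last valid index 25 of a 26-element array, so the array-based approach from the question would throw an ArrayIndexOutOfBoundsException for such a word. The map has no such limit.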