Kotlin字符串最大长度?(具有较长字符串的Kotlin文件未能编译)

huangapple go评论132阅读模式
英文:

Kotlin String max length? (Kotlin file with a long String is not compiling)

问题

根据这个答案,Java 可以容纳最多 2^31 - 1 个字符。我试图进行基准测试等操作,所以我尝试创建大量的字符串,并将其写入文件,就像这样:

import java.io.*

fun main() {
    val out = File("ouput.txt").apply { createNewFile() }.printWriter()
    sequence {
        var x = 0
        while (true) {
            yield("${++x} ${++x} ${++x} ${++x} ${++x}")
        }
    }.take(5000).forEach { out.println(it) }
    out.close()
}

然后 output.txt 文件的内容如下所示:

1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
// ... 5000 行

然后我将文件的所有内容复制到一个字符串中,以便对某些函数进行基准测试,代码如下:

import kotlin.system.*

fun main() {
    val inputs = """
        1 2 3 4 5
        6 7 8 9 10
        11 12 13 14 15
        16 17 18 19 20
        21 22 23 24 25
        // ... 5000 行
        24986 24987 24988 24989 24990
        24991 24992 24993 24994 24995
        24996 24997 24998 24999 25000
    """.trimIndent()
    measureNanoTime {
        inputs.reader().forEachLine { line ->
            val (a, b, c, d, e) = line.split(" ").map(String::toInt)
        }
    }.div(5000).let(::println)
}

> 文件/字符串的总字符数为 138894
>
> 字符串最多可以容纳 2147483647 个字符

但是 Kotlin 代码无法编译(最后的文件) 它抛出编译错误:

e: org.jetbrains.kotlin.codegen.CompilationException: Back-end (JVM) Internal error: wrong bytecode generated
// 更多行
The root cause java.lang.IllegalArgumentException was thrown at: org.jetbrains.org.objectweb.asm.ByteVector.putUTF8(ByteVector.java:246)
    at org.jetbrains.kotlin.codegen.TransformationMethodVisitor.visitEnd(TransformationMethodVisitor.kt:92)
    at org.jetbrains.kotlin.codegen.FunctionCodegen.endVisit(FunctionCodegen.java:971)
    ... 43 more
Caused by: java.lang.IllegalArgumentException: UTF8 string too large

这是异常的完整日志和堆栈跟踪:https://gist.github.com/Animeshz/1a18a7e99b0c0b913027b7fb36940c07

英文:

According to this answer Java can hold up to 2^31 - 1 characters. I was trying to do benchmarking and stuffs, so I tried to create a large amount of string and write it to a file like this:

import java.io.*

fun main() {
    val out = File("ouput.txt").apply { createNewFile() }.printWriter()
    sequence {
        var x = 0
        while (true) {
            yield("${++x} ${++x} ${++x} ${++x} ${++x}")
        }
    }.take(5000).forEach { out.println(it) }
    out.close()
}

And then the output.txt file contains like this:

1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
// ... 5000 lines

And then I copied all the contents of the file into a string for some benchmarking of some functions, so this is how it looks:

import kotlin.system.*

fun main() {
    val inputs = """
        1 2 3 4 5
        6 7 8 9 10
        11 12 13 14 15
        16 17 18 19 20
        21 22 23 24 25
        // ... 5000 lines
        24986 24987 24988 24989 24990
        24991 24992 24993 24994 24995
        24996 24997 24998 24999 25000

    """.trimIndent()
    measureNanoTime {
        inputs.reader().forEachLine { line ->
            val (a, b, c, d, e) = line.split(" ").map(String::toInt)
        }
    }.div(5000).let(::println)
}

> The total character count of the file/string is 138894
>
> String can hold up to 2147483647

But the Kotlin code does not compile (last file) It throws compilation error:

e: org.jetbrains.kotlin.codegen.CompilationException: Back-end (JVM) Internal error: wrong bytecode generated
// more lines
The root cause java.lang.IllegalArgumentException was thrown at: org.jetbrains.org.objectweb.asm.ByteVector.putUTF8(ByteVector.java:246)
    at org.jetbrains.kotlin.codegen.TransformationMethodVisitor.visitEnd(TransformationMethodVisitor.kt:92)
    at org.jetbrains.kotlin.codegen.FunctionCodegen.endVisit(FunctionCodegen.java:971)
    ... 43 more
Caused by: java.lang.IllegalArgumentException: UTF8 string too large

Here is the total log of exception with stacktrace: https://gist.github.com/Animeshz/1a18a7e99b0c0b913027b7fb36940c07

答案1

得分: 4

在Java类文件中有限制,字符串常量的长度必须适应16位,即源代码中的字符串最大长度为65535字节(而不是字符)。

类文件格式

英文:

There is limit in java class file, length of string constant must fit in 16 bits ie. 65535 bytes (not characters) is max length of string in source code.

The class File Format

huangapple
  • 本文由 发表于 2020年5月30日 13:32:54
  • 转载请务必保留本文链接:https://go.coder-hub.com/62098263.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定