How to divide a multiline string into a list of strings, each containing a specific number of lines?

Question

I have a string like this:

1
00:01:22,416 --> 00:01:25,146
Hey, Jack, those racks
don't go up there.

2
00:01:25,153 --> 00:01:27,915
- What do you mean?
- They go in the parking lot.

3
00:01:27,921 --> 00:01:30,315
The ones with blue plastic go up there.

4
00:01:30,324 --> 00:01:34,161
Ah, okay, hey, sorry, I didn't realize.

I want to divide the string into a list of strings, each containing a specific number of lines. For example, if the number is 5, each string should contain 5 lines, and any remaining lines should go into one final, shorter string. How can I do that?

Answer 1

Score: 1

Here is some sample code in Kotlin. Since the question is tagged Kotlin, I assume you need Kotlin code.
You can also test it here: https://pl.kotl.in/GnuwwQSxd

fun divideStringIntoChunks(input: String, chunkSize: Int): List<String> {
    // Split into lines and strip surrounding whitespace from each one
    val lines = input.lines().map { it.trim() }
    // Group the lines; the last group holds whatever is left over
    val chunks = lines.chunked(chunkSize)
    // Rebuild each group of lines as a single string
    return chunks.map { it.joinToString("\n") }
}

fun main() {
    val input = """
        1
        00:01:22,416 --> 00:01:25,146
        Hey, Jack, those racks
        don't go up there.

        2
        00:01:25,153 --> 00:01:27,915
        - What do you mean?
        - They go in the parking lot.

        3
        00:01:27,921 --> 00:01:30,315
        The ones with blue plastic go up there.

        4
        00:01:30,324 --> 00:01:34,161
        Ah, okay, hey, sorry, I didn't realize.
    """.trimIndent()

    val chunkSize = 5
    val dividedStrings = divideStringIntoChunks(input, chunkSize)
    dividedStrings.forEachIndexed { index, str ->
        println("String ${index + 1}:")
        println(str)
        println()
    }
}
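
For large inputs, the same idea can be sketched lazily with a sequence, so the full list of lines is not built before chunking. This is only a sketch: the name chunkLines is hypothetical and not part of the answer above, and unlike that answer it does not trim each line.

```kotlin
// Hypothetical helper (name is an assumption, not from the original answer).
// Splits the input into strings of at most chunkSize lines each.
fun chunkLines(input: String, chunkSize: Int): List<String> {
    require(chunkSize > 0) { "chunkSize must be positive" }
    return input.lineSequence()            // iterate over the lines lazily
        .chunked(chunkSize)                // groups of up to chunkSize lines
        .map { it.joinToString("\n") }     // rebuild each group as one string
        .toList()
}

fun main() {
    // 12 lines with a chunk size of 5 gives chunks of 5, 5 and 2 lines
    val input = (1..12).joinToString("\n") { "line $it" }
    val chunks = chunkLines(input, 5)
    println(chunks.size)      // 3
    println(chunks.last())    // the two remaining lines: "line 11" and "line 12"
}
```

The final chunk is simply shorter than the others, which matches the "remaining lines go into another string" requirement in the question.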

Answer 2

Score: 0

val countOfSubtitles = 400

val result = input
  .split("\n\n")
  .chunked(countOfSubtitles)
  .map { it.joinToString("\n\n") }
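
A self-contained version of the snippet above might look like the following. It chunks by whole subtitle blocks instead of by raw line count, so a subtitle is never cut in half. The function name chunkSubtitleBlocks and the blank-line regex are assumptions made for illustration; the sketch also normalizes Windows line endings, which the original snippet does not do.

```kotlin
// Hypothetical helper (name is an assumption, not from the original answer).
// Groups subtitle blocks, which are separated by blank lines, into chunks.
fun chunkSubtitleBlocks(input: String, blocksPerChunk: Int): List<String> =
    input.replace("\r\n", "\n")              // normalize Windows line endings
        .trim()
        .split(Regex("""\n\s*\n"""))         // one or more blank lines separate blocks
        .chunked(blocksPerChunk)             // last chunk may hold fewer blocks
        .map { it.joinToString("\n\n") }     // restore the blank-line separator

fun main() {
    val input = """
        1
        00:01:22,416 --> 00:01:25,146
        Hey, Jack, those racks
        don't go up there.

        2
        00:01:25,153 --> 00:01:27,915
        - What do you mean?
        - They go in the parking lot.

        3
        00:01:27,921 --> 00:01:30,315
        The ones with blue plastic go up there.
    """.trimIndent()

    val chunks = chunkSubtitleBlocks(input, 2)
    println(chunks.size)    // 2
    println(chunks[1])      // subtitle 3 on its own
}
```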

Another option, depending on what happens with the chunks later in the application, would be to transform the input string into a List of data class instances instead. That might make working with the data easier. Something along these lines:

data class Subtitle (
  val number: Int,
  val timecodeStart: String,
  val timecodeEnd: String,
  val subtitles: List<String>
)

val subtitles = input
  .trimIndent()
  .split("\n\n")
  .filter { it.isNotEmpty() }
  .map {
    val item = it.split("\n")
    Subtitle(
      number = item[0].toInt(),
      timecodeStart = item[1].substringBefore(" --> "),
      timecodeEnd = item[1].substringAfter(" --> "),
      subtitles = item.slice(2 until item.size)
    )
  }
  // .chunked(400)   // optional

subtitles.forEach(::println)

Output:

Subtitle(number=1, timecodeStart=00:01:22,416, timecodeEnd=00:01:25,146, subtitles=[Hey, Jack, those racks, don't go up there.])
Subtitle(number=2, timecodeStart=00:01:25,153, timecodeEnd=00:01:27,915, subtitles=[- What do you mean?, - They go in the parking lot.])
Subtitle(number=3, timecodeStart=00:01:27,921, timecodeEnd=00:01:30,315, subtitles=[The ones with blue plastic go up there.])
Subtitle(number=4, timecodeStart=00:01:30,324, timecodeEnd=00:01:34,161, subtitles=[Ah, okay, hey, sorry, I didn't realize.])
Subtitle(number=5, timecodeStart=00:12:34,567, timecodeEnd=00:12:56,789, subtitles=[Ah, okay, hey, sorry, I did suddenly realize.])
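
The parsing above throws if a block has a non-numeric first line or fewer than two lines. A more defensive variant, sketched under the assumption that malformed blocks should simply be skipped, could look like this. The names parseSubtitleOrNull and parseSubtitles are hypothetical, and the Subtitle class is repeated only to keep the example self-contained.

```kotlin
data class Subtitle(
    val number: Int,
    val timecodeStart: String,
    val timecodeEnd: String,
    val subtitles: List<String>
)

// Hypothetical helper: returns null instead of throwing on a malformed block.
fun parseSubtitleOrNull(block: String): Subtitle? {
    val lines = block.lines().map { it.trim() }.filter { it.isNotEmpty() }
    if (lines.size < 2) return null                       // need number and timecodes
    val number = lines[0].toIntOrNull() ?: return null    // first line must be an Int
    val timecodes = lines[1].split(" --> ")
    if (timecodes.size != 2) return null                  // expect "start --> end"
    return Subtitle(number, timecodes[0], timecodes[1], lines.drop(2))
}

// Hypothetical helper: splits on blank lines and keeps only well-formed blocks.
fun parseSubtitles(input: String): List<Subtitle> =
    input.replace("\r\n", "\n")
        .trim()
        .split(Regex("""\n\s*\n"""))
        .mapNotNull(::parseSubtitleOrNull)

fun main() {
    val input = "1\n00:01:22,416 --> 00:01:25,146\nHey, Jack, those racks\n" +
        "don't go up there.\n\nnot a subtitle\n\n" +
        "2\n00:01:25,153 --> 00:01:27,915\n- What do you mean?"
    // The middle block is malformed and is skipped; subtitles 1 and 2 are printed
    parseSubtitles(input).forEach(::println)
}
```

Whether to skip, log, or fail on bad blocks depends on the application; skipping is just the choice made for this sketch.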

huangapple
  • Posted on May 24, 2023 at 19:05:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76322855.html