Can you regexp.Split a string with a regexp that does not contain a word in Go?

Question

I wrote a function that takes a string containing a list of numbers, a regexp for the list item separator, and a regexp for the range separator:

func ParseList(list string, ls, rs *regexp.Regexp) (...) {
  ...
  for _, item := range ls.Split(list, -1) {
    // Split the item into one bound (single number) or two bounds (range)
    rng := rs.Split(item, -1)
    rangeStart, _ := strconv.ParseInt(rng[0], 10, 64)
    rangeEnd := rangeStart
    if len(rng) > 1 {
      rangeEnd, _ = strconv.ParseInt(rng[1], 10, 64)
    }
    ... save or analyse range
  }
  ...
}

The input string could be any of the following, among other formats:
1,3,5-10,15-20
1;3;5-10;15-20
1 3 5 to 10 15 to 20

For the first two examples the regexps are easy, but the third is a problem: the list separator is " " and the range separator " to " also contains " ", so splitting on spaces breaks the ranges apart.

UPDATE:

I followed Trung Duong's suggestion and switched from a "splitter" regexp to an "extractor" regexp, like this:

func ParseList(list string, ex, rs *regexp.Regexp) (...) {
  ...
  for _, item := range ex.FindAllString(list, -1) {
    // Split the extracted item into one bound (single number) or two bounds (range)
    rng := rs.Split(item, -1)
    rangeStart, _ := strconv.ParseInt(rng[0], 10, 64)
    rangeEnd := rangeStart
    if len(rng) > 1 {
      rangeEnd, _ = strconv.ParseInt(rng[1], 10, 64)
    }
    ... save or analyse range
  }
  ...
}

Now I can use (\d+-\d+)|(\d+) or (\d+ to \d+)|(\d+) to extract the ranges, and then split each one with the range separator.

Answer 1

Score: 2

You could use this pattern: (\d+\s+to\s+\d+)|(\d+)

See regex demo here.
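As a possible variation on this pattern, the sketch below moves the capture groups onto the individual numbers, so the bounds can be read from the submatches without a second split. The modified pattern and the helper name parseItems are assumptions for illustration and are not part of the answer.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// itemRe is a variant of the answer's pattern: groups 1 and 2 capture the
// bounds of a range, group 3 captures a single number.
var itemRe = regexp.MustCompile(`(\d+)\s+to\s+(\d+)|(\d+)`)

// parseItems is a hypothetical helper returning [start, end] pairs.
func parseItems(list string) ([][2]int, error) {
	var out [][2]int
	for _, m := range itemRe.FindAllStringSubmatch(list, -1) {
		first, last := m[1], m[2]
		if m[3] != "" { // single number: start and end are the same
			first, last = m[3], m[3]
		}
		start, err := strconv.Atoi(first)
		if err != nil {
			return nil, err
		}
		end, err := strconv.Atoi(last)
		if err != nil {
			return nil, err
		}
		out = append(out, [2]int{start, end})
	}
	return out, nil
}

func main() {
	items, err := parseItems("1 3 5 to 10 15  to  20")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(items)
}
```

Because the pattern uses \s+ around "to", extra spaces inside a range are tolerated.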

Answer 2

Score: 0

Here is the algorithm you need. A simple iteration over the space-separated tokens is enough.

package main

import (
    "fmt"
    "strconv"
    "strings"
)

func main() {
    inputString := "1 2 7 to 10 15 16 to 20"

    // Split the string on whitespace to get individual tokens
    elements := strings.Fields(inputString)

    // Initialize a slice to store the output
    output := []string{}

    // Iterate through the tokens
    i := 0
    for i < len(elements) {
        // If the next token is "to" and another token follows it,
        // the three tokens together form a range
        if i+2 < len(elements) && elements[i+1] == "to" {
            // Get the start and end of the range (parse errors are ignored here)
            start, _ := strconv.Atoi(elements[i])
            end, _ := strconv.Atoi(elements[i+2])
            // Add the range to the output and skip all three tokens
            output = append(output, fmt.Sprintf("%d to %d", start, end))
            i += 3
        } else {
            // If the token is not the start of a range, simply add it to the output
            output = append(output, elements[i])
            i++
        }
    }

    // Print the output
    for _, element := range output {
        fmt.Println(element)
    }
}

OUTPUT:

1
2
7 to 10
15
16 to 20

huangapple
  • Posted on March 27, 2023 at 14:12:09
  • Please keep this link when reposting: https://go.coder-hub.com/75852589.html