How do I include the operators in my output when I split my string?

Question

Yesterday I asked this question about splitting a string in Python. I've since decided to do this project in Go instead. I have the following:

input := "house-width + 3 - y ^ (5 * house length)"
s := regexp.MustCompile(" ([+-/*^]) ").Split(input, -1)
log.Println(s)  //  [house-width 3 y (5 house length)]

How do I include the operators in this output? e.g. I'd like the following output:

['house-width', '+', '3', '-', 'y', '^', '(5', '*', 'house length)']

EDIT:
To clarify: I am splitting on the space-separated operators, not just on the operator characters. An operator must have a space on both sides to distinguish it from a dash/hyphen. Please refer to my original Python question I linked to for clarification if needed.

Answer 1

Score: 1

You can get the operands of your expression using regexp.Split() (just as you did), and you can get the operators (the separators) using regexp.FindAllString().

This gives you two separate []string slices; merge them if you want the result in a single []string slice.

input := "house-width + 3 - y ^ (5 * house length)"

r := regexp.MustCompile(`\s([+\-/*^])\s`)

s1 := r.Split(input, -1)
s2 := r.FindAllString(input, -1)

fmt.Printf("%q\n", s1)
fmt.Printf("%q\n", s2)

all := make([]string, len(s1)+len(s2))
for i := range s1 {
	all[i*2] = s1[i]
	if i < len(s2) {
		all[i*2+1] = s2[i]
	}
}
fmt.Printf("%q\n", all)

Output (try it on the Go Playground):

["house-width" "3" "y" "(5" "house length)"]
[" + " " - " " ^ " " * "]
["house-width" " + " "3" " - " "y" " ^ " "(5" " * " "house length)"]

Note:

If you want to trim the spaces from the operators, you can use the strings.TrimSpace() function for that:

for i, v := range s2 {
	all[i*2+1] = strings.TrimSpace(v)
}
fmt.Printf("%q\n", all)

Output:

["house-width" "+" "3" "-" "y" "^" "(5" "*" "house length)"]

Answer 2

Score: 1

If you're planning to parse the expression afterwards, you'll have to make some changes:

  • Include parentheses as lexemes
  • You can't have both spaces and dashes be valid identifier characters, because then, e.g., the "- y" between 3 and ^ would itself be a valid identifier.

After that's done, you can use a simple linear iteration to lex your string:

package main

import (
	"bytes"
	"fmt"
)

func main() {

	input := `house width + 3 - y ^ (5 * house length)`
	buffr := bytes.NewBuffer(nil) // collects the characters of the current identifier
	outpt := make([]string, 0)

	for _, r := range input {
		// Operators, parentheses, and digits are emitted as single-character lexemes.
		if r == '+' || r == '-' || r == '*' || r == '/' || r == '^' || r == '(' || r == ')' || (r >= '0' && r <= '9') {
			// Emit the identifier collected so far, if any.
			bs := bytes.TrimSpace(buffr.Bytes())
			if len(bs) > 0 {
				outpt = append(outpt, string(bs))
			}
			outpt = append(outpt, string(r))
			buffr.Reset()
		} else {
			buffr.WriteRune(r)
		}
	}

	// Emit a trailing identifier when the input does not end with a single-character lexeme.
	if bs := bytes.TrimSpace(buffr.Bytes()); len(bs) > 0 {
		outpt = append(outpt, string(bs))
	}

	fmt.Printf("%#v\n", outpt)

}

Once lexed, use Dijkstra's shunting-yard algorithm to build an AST or directly evaluate the expression.

Answer 3

Score: 0

I think FindAll() may be the way to go.

Expanded regex (laid out with whitespace and comments for readability; the Go code uses the compact form):

 \s*                           # Trim preceding whitespace
 (                             # (1 start), Operator/Non-Operator chars
      (?:                           # Cluster group
           \w -                          # word dash
        |  - \w                          # or, dash word
        |  [^+\-/*^]                     # or, a non-operator char
      )+                            # End cluster, do 1 to many times
   |                              # or,
      [+\-/*^]                      # A single simple math operator
 )                             # (1 end)
 \s*                           # Trim trailing whitespace

Go code snippet:

http://play.golang.org/p/bHZ21B6Tzi

package main

import (
	"log"
	"regexp"
)

func main() {
	in := []byte("house-width + 3 - y ^ (5 * house length)")
	// Each match is either a run of non-operator text (hyphenated words allowed)
	// or a single operator, together with any surrounding whitespace.
	r := regexp.MustCompile("\\s*((?:\\w-|-\\w|[^+\\-/*^])+|[+\\-/*^])\\s*")
	s := r.FindAll(in, -1)
	for _, ss := range s {
		log.Println(string(ss))
	}
}

Output:

 house-width 
 + 
 3 
 - 
 y 
 ^ 
 (5 
 * 
 house length)

huangapple
  • Posted on June 4, 2015, 08:44:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/30633189.html