What is the best way to test for an empty string in Go?


Question

Which method is best (most idiomatic) for testing non-empty strings in Go?

if len(mystring) > 0 { }

Or:

if mystring != "" { }

Or something else?

Answer 1

Score: 682

Both styles are used within Go's standard library.

if len(s) > 0 { ... }

can be found in the strconv package: http://golang.org/src/pkg/strconv/atoi.go

if s != "" { ... }

can be found in the encoding/json package: http://golang.org/src/pkg/encoding/json/encode.go

Both are idiomatic and clear enough. It is more a matter of personal taste and clarity.

Russ Cox writes in a golang-nuts thread:

> The one that makes the code clear.
> If I'm about to look at element x I typically write
> len(s) > x, even for x == 0, but if I care about
> "is it this specific string" I tend to write s == "".
>
> It's reasonable to assume that a mature compiler will compile
> len(s) == 0 and s == "" into the same, efficient code.
> ...
>
> Make the code clear.

As pointed out in Timmmm's answer, the Go compiler does generate identical code in both cases.
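
As a small illustration of the point that the two spellings mean the same thing for strings, here is a minimal sketch. The helper names nonEmptyByLen and nonEmptyByCompare are made up for this example and do not come from the standard library.

```go
package main

import "fmt"

// nonEmptyByLen reports whether s is non-empty using the length check.
func nonEmptyByLen(s string) bool {
	return len(s) > 0
}

// nonEmptyByCompare reports whether s is non-empty by comparing with "".
func nonEmptyByCompare(s string) bool {
	return s != ""
}

func main() {
	// Print both results side by side for a few sample inputs.
	for _, s := range []string{"", " ", "go"} {
		fmt.Printf("%q: len=%t compare=%t\n", s, nonEmptyByLen(s), nonEmptyByCompare(s))
	}
}
```

Which one to use then comes down to which reads more clearly at the call site, as the quote above suggests.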

Answer 2

Score: 45

This seems to be premature micro-optimization. The compiler is free to produce the same code for both cases, or at least for these two:

if len(s) != 0 { ... }

and

if s != "" { ... }

because the semantics are clearly equal.

Answer 3

Score: 39

Assuming that a string consisting only of whitespace should also count as empty, trim leading and trailing whitespace first:

import "strings"
if len(strings.TrimSpace(s)) == 0 { ... }

Because:
len("") // is 0
len(" ") // one space is 1
len("  ") // two spaces is 2

Answer 4

Score: 28

Checking the length is a good answer, but you may also want to treat a string that contains only whitespace as "empty". It is not technically empty, but if you care to check:

package main

import (
  "fmt"
  "strings"
)

func main() {
  stringOne := "merpflakes"
  stringTwo := "   "
  stringThree := ""

  if len(strings.TrimSpace(stringOne)) == 0 {
    fmt.Println("String is empty!")
  }
  
  if len(strings.TrimSpace(stringTwo)) == 0 {
    fmt.Println("String two is empty!")
  }
  
  if len(stringTwo) == 0 {
    fmt.Println("String two is still empty!")
  }
  
  if len(strings.TrimSpace(stringThree)) == 0 {
    fmt.Println("String three is empty!")
  }
}

Answer 5

Score: 18

As of now, the Go compiler generates identical code in both cases, so it is a matter of taste. GCCGo does generate different code, but barely anyone uses it so I wouldn't worry about that.

https://godbolt.org/z/fib1x1

Answer 6

Score: 16

According to the official guidelines and from a performance point of view the two appear equivalent (see ANisus's answer), but s != "" is preferable because of a compile-time advantage: s != "" fails to compile if the variable is not a string, while len(s) == 0 also compiles for several other data types.
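
To illustrate the type-checking point, here is a minimal sketch. The function names isEmptyString and isEmptyBytes are hypothetical, chosen only for this example.

```go
package main

import "fmt"

// isEmptyString compares against ""; this spelling only compiles
// when the operand is a string (or a type based on string).
func isEmptyString(s string) bool {
	return s == ""
}

// isEmptyBytes uses the len spelling, which also compiles for a slice.
func isEmptyBytes(b []byte) bool {
	return len(b) == 0
	// return b == "" // this line would not compile: b is a []byte, not a string
}

func main() {
	fmt.Println(isEmptyString(""), isEmptyBytes(nil), isEmptyBytes([]byte("x")))
}
```

If a variable's type later changes from string to, say, []byte, the comparison with "" is flagged by the compiler, while the len check keeps compiling silently.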

Answer 7

Score: 6

I think == "" is faster and more readable.

package main

import (
	"fmt"
)
func main() {
	n := 1
	s := ""
	if len(s) == 0 {
		n = 2
	}
	fmt.Printf("%d\n", n)
}

Running dlv debug playground.go and comparing the len(s) == 0 and s == "" versions, I got the following.

The s == "" case:

    playground.go:6         0x1008d9d20     810b40f9        MOVD 16(R28), R1       
    playground.go:6         0x1008d9d24     e28300d1        SUB $32, RSP, R2       
    playground.go:6         0x1008d9d28     5f0001eb        CMP R1, R2             
    playground.go:6         0x1008d9d2c     09070054        BLS 56(PC)             
    playground.go:6         0x1008d9d30*    fe0f16f8        MOVD.W R30, -160(RSP)  

    playground.go:6         0x1008d9d34     fd831ff8        MOVD R29, -8(RSP)      
    playground.go:6         0x1008d9d38     fd2300d1        SUB $8, RSP, R29       
    playground.go:7         0x1008d9d3c     e00340b2        ORR $1, ZR, R0         

    playground.go:7         0x1008d9d40     e01f00f9        MOVD R0, 56(RSP)       
    playground.go:8         0x1008d9d44     ff7f05a9        STP (ZR, ZR), 80(RSP)  

    playground.go:9         0x1008d9d48     01000014        JMP 1(PC)                        
    playground.go:10        0x1008d9d4c     e0037fb2        ORR $2, ZR, R0         

The len(s) == 0 case:

    playground.go:6         0x100761d20     810b40f9        MOVD 16(R28), R1       
    playground.go:6         0x100761d24     e2c300d1        SUB $48, RSP, R2       
    playground.go:6         0x100761d28     5f0001eb        CMP R1, R2             
    playground.go:6         0x100761d2c     29070054        BLS 57(PC)             
    playground.go:6         0x100761d30*    fe0f15f8        MOVD.W R30, -176(RSP)  

    playground.go:6         0x100761d34     fd831ff8        MOVD R29, -8(RSP)      
    playground.go:6         0x100761d38     fd2300d1        SUB $8, RSP, R29       
    playground.go:7         0x100761d3c     e00340b2        ORR $1, ZR, R0         

    playground.go:7         0x100761d40     e02300f9        MOVD R0, 64(RSP)       
    playground.go:8         0x100761d44     ff7f06a9        STP (ZR, ZR), 96(RSP)  
    playground.go:9         0x100761d48     ff2700f9        MOVD ZR, 72(RSP)       

    playground.go:9         0x100761d4c     01000014        JMP 1(PC)              
    playground.go:10        0x100761d50     e0037fb2        ORR $2, ZR, R0         
    playground.go:10        0x100761d54     e02300f9        MOVD R0, 64(RSP)       
    playground.go:10        0x100761d58     01000014        JMP 1(PC)      
    playground.go:6         0x104855d2c     09070054        BLS 56(PC)        


Answer 8

Score: 1

Just to add to the comments, mainly about how to do the performance testing.

I tested with the following code:

import (
	"testing"
)

var ss = []string{"Hello", "", "bar", " ", "baz", "ewrqlosakdjhf12934c r39yfashk fjkashkfashds fsdakjh-", "", "123"}

func BenchmarkStringCheckEq(b *testing.B) {
	c := 0
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		for _, s := range ss {
			if s == "" {
				c++
			}
		}
	}
	t := 2 * b.N
	if c != t {
		b.Fatalf("did not catch empty strings: %d != %d", c, t)
	}
}

func BenchmarkStringCheckLen(b *testing.B) {
	c := 0
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		for _, s := range ss {
			if len(s) == 0 {
				c++
			}
		}
	}
	t := 2 * b.N
	if c != t {
		b.Fatalf("did not catch empty strings: %d != %d", c, t)
	}
}

func BenchmarkStringCheckLenGt(b *testing.B) {
	c := 0
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		for _, s := range ss {
			if len(s) > 0 {
				c++
			}
		}
	}
	t := 6 * b.N
	if c != t {
		b.Fatalf("did not catch empty strings: %d != %d", c, t)
	}
}

func BenchmarkStringCheckNe(b *testing.B) {
	c := 0
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		for _, s := range ss {
			if s != "" {
				c++
			}
		}
	}
	t := 6 * b.N
	if c != t {
		b.Fatalf("did not catch empty strings: %d != %d", c, t)
	}
}

And results were:

% for a in $(seq 50);do go test -run=^$ -bench=. --benchtime=1s ./...|grep Bench;done | tee -a log
% sort -k 3n log | head -10
BenchmarkStringCheckEq-4      	150149937	         8.06 ns/op
BenchmarkStringCheckLenGt-4   	147926752	         8.06 ns/op
BenchmarkStringCheckLenGt-4   	148045771	         8.06 ns/op
BenchmarkStringCheckNe-4      	145506912	         8.06 ns/op
BenchmarkStringCheckLen-4     	145942450	         8.07 ns/op
BenchmarkStringCheckEq-4      	146990384	         8.08 ns/op
BenchmarkStringCheckLenGt-4   	149351529	         8.08 ns/op
BenchmarkStringCheckNe-4      	148212032	         8.08 ns/op
BenchmarkStringCheckEq-4      	145122193	         8.09 ns/op
BenchmarkStringCheckEq-4      	146277885	         8.09 ns/op

In effect, a given variant usually does not reach its fastest time, and there is only a minimal difference (about 0.01 ns/op) between the top speeds of the variants.

Looking at the full log, the difference between runs is greater than the difference between benchmark functions.

There also does not seem to be any measurable difference between BenchmarkStringCheckEq and BenchmarkStringCheckNe, or between BenchmarkStringCheckLen and BenchmarkStringCheckLenGt, even though the latter variants should increment c 6 times instead of 2.

You can gain some confidence that the performance is equal by adding benchmarks with a modified test or inner loop. This is faster:

func BenchmarkStringCheckNone4(b *testing.B) {
	c := 0
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		for range ss {
			c++
		}
	}
	t := len(ss) * b.N
	if c != t {
		b.Fatalf("did not catch empty strings: %d != %d", c, t)
	}
}

This is not faster:

func BenchmarkStringCheckEq3(b *testing.B) {
	ss2 := make([]string, len(ss))
	prefix := "a"
	for i := range ss {
		ss2[i] = prefix + ss[i]
	}
	c := 0
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		for _, s := range ss2 {
			if s == prefix {
				c++
			}
		}
	}
	t := 2 * b.N
	if c != t {
		b.Fatalf("did not catch empty strings: %d != %d", c, t)
	}
}

Both of these variants differ from the main benchmarks by more than the main benchmarks differ from each other.

It would also be good to generate the test strings (ss) using a string generator with a relevant distribution, and with variable lengths too.

So I have no confidence that there is any performance difference between the main methods of testing for an empty string in Go.

I can state with some confidence that it is faster not to test for an empty string at all than to test for one, and that testing for an empty string is faster than testing against a 1-character string (the prefix variant).
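
The suggestion about generating the test strings could be sketched roughly as follows. Everything here is an assumption made for illustration: the helper names genStrings and countEmpty, the fixed seed, the share of empty strings, and the use of repeated "x" characters as content. A real benchmark would want a distribution that matches the actual workload.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// genStrings returns n strings built from a seeded generator.
// Roughly emptyRatio of them are empty; the rest have a random
// length between 1 and maxLen. maxLen must be at least 1.
func genStrings(seed int64, n, maxLen int, emptyRatio float64) []string {
	rng := rand.New(rand.NewSource(seed))
	out := make([]string, n)
	for i := range out {
		if rng.Float64() < emptyRatio {
			continue // leave out[i] as the empty string
		}
		out[i] = strings.Repeat("x", 1+rng.Intn(maxLen))
	}
	return out
}

// countEmpty counts the empty strings, so a benchmark can compute
// its expected counter value instead of hard-coding 2 or 6.
func countEmpty(ss []string) int {
	c := 0
	for _, s := range ss {
		if s == "" {
			c++
		}
	}
	return c
}

func main() {
	ss := genStrings(1, 1000, 64, 0.25)
	fmt.Println(len(ss), countEmpty(ss))
}
```

In a benchmark, the slice would be generated once before b.ResetTimer(), and the expected count would be countEmpty(ss) * b.N.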

Answer 9

Score: 0

It would be cleaner and less error-prone to use a function like the one below:

import "strings"

func empty(s string) bool {
	return len(strings.TrimSpace(s)) == 0
}

Answer 10

Score: -1

This would be more performant than trimming the whole string, since you only need to check whether at least one non-space character exists.

import "unicode"

// Strempty checks whether string contains only whitespace or not
func Strempty(s string) bool {
	if len(s) == 0 {
		return true
	}
	r := []rune(s)
	l := len(r)
	for l > 0 {
		l--
		if !unicode.IsSpace(r[l]) {
			return false
		}
	}
	return true
}
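
One possible variation on the function above, offered as a sketch rather than a measured improvement: ranging over the string decodes one rune at a time, so no []rune copy of the input is needed and the loop can return at the first non-space character. The name isBlank is a hypothetical helper, not part of the original answer.

```go
package main

import (
	"fmt"
	"unicode"
)

// isBlank reports whether s is empty or contains only Unicode
// whitespace. It stops at the first non-space rune it finds.
func isBlank(s string) bool {
	for _, r := range s {
		if !unicode.IsSpace(r) {
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"", " \t\n", "  a  "} {
		fmt.Printf("%q blank=%t\n", s, isBlank(s))
	}
}
```

Unlike the version above, this scans from the front of the string. Whether it is actually faster for a given input would need to be benchmarked.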

Answer 11

Score: -3

I think the best way is to compare with the empty string.

BenchmarkStringCheck1 compares with the empty string.

BenchmarkStringCheck2 checks for a length of zero.

I ran the checks against both empty and non-empty strings. You can see that comparing with the empty string is faster.

BenchmarkStringCheck1-4   	2000000000	         0.29 ns/op	       0 B/op	       0 allocs/op
BenchmarkStringCheck1-4   	2000000000	         0.30 ns/op	       0 B/op	       0 allocs/op
BenchmarkStringCheck2-4   	2000000000	         0.30 ns/op	       0 B/op	       0 allocs/op
BenchmarkStringCheck2-4   	2000000000	         0.31 ns/op	       0 B/op	       0 allocs/op

Code

func BenchmarkStringCheck1(b *testing.B) {
	s := "Hello"
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		if s == "" {
		}
	}
}

func BenchmarkStringCheck2(b *testing.B) {
	s := "Hello"
	b.ResetTimer()
	for n := 0; n < b.N; n++ {
		if len(s) == 0 {
		}
	}
}

huangapple
  • Posted on September 3, 2013 at 22:02:35
  • When reposting, please keep the link to this article: https://go.coder-hub.com/18594330.html