Parse a command line string into flags and arguments in Golang

Question

I'm looking for a package that would take a string such as -v --format "some example" -i test and parse it into a slice of strings, handling quotes, spaces, etc. properly:

-v
--format
some example
-i
test

I've checked the built-in flag package as well as other flag-handling packages on GitHub, but none of them seem to handle this particular case of parsing a raw string into tokens. Before trying to do it myself, I'd rather look for an existing package, as I'm sure there are a lot of special cases to handle.

Any suggestions?

Answer 1

Score: 14

Looks similar to shlex:

import "github.com/google/shlex"

args, err := shlex.Split("one \"two three\" four")
// args == []string{"one", "two three", "four"}

Answer 2

Score: 11

For information, this is the function I ended up creating.

It splits a command into its arguments. For example, cat -v "some file.txt" will return ["cat", "-v", "some file.txt"].

It also correctly handles escaped characters, spaces in particular. So cat -v some\ file.txt will also be split correctly into ["cat", "-v", "some file.txt"].

import "fmt"

// parseCommandLine splits command into arguments, honoring single and
// double quotes and backslash escapes outside quotes.
func parseCommandLine(command string) ([]string, error) {
    var args []string
    state := "start" // "start" (between args), "arg" (inside an arg) or "quotes"
    current := ""
    quote := "\""
    escapeNext := false // starting with true would treat the first character as escaped
    for i := 0; i < len(command); i++ {
        c := command[i]

        // Inside quotes: copy everything up to the matching quote.
        if state == "quotes" {
            if string(c) != quote {
                current += string(c)
            } else {
                args = append(args, current)
                current = ""
                state = "start"
            }
            continue
        }

        // The previous character was a backslash: take this one literally.
        if escapeNext {
            current += string(c)
            escapeNext = false
            continue
        }

        if c == '\\' {
            escapeNext = true
            continue
        }

        if c == '"' || c == '\'' {
            state = "quotes"
            quote = string(c)
            continue
        }

        if state == "arg" {
            if c == ' ' || c == '\t' {
                args = append(args, current)
                current = ""
                state = "start"
            } else {
                current += string(c)
            }
            continue
        }

        if c != ' ' && c != '\t' {
            state = "arg"
            current += string(c)
        }
    }

    if state == "quotes" {
        return nil, fmt.Errorf("unclosed quote in command line: %s", command)
    }

    if current != "" {
        args = append(args, current)
    }

    return args, nil
}
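
One thing to watch in a byte-indexed loop like the one above: command[i] yields a byte, and converting a single byte with string(...) produces the code point with that numeric value, not a piece of the original UTF-8 sequence. ASCII input is unaffected, but multi-byte characters get mangled. A minimal sketch of the effect follows; the helper names rebuildByBytes and rebuildByRunes are made up for illustration and are not part of the answer.

```go
package main

import "fmt"

// rebuildByBytes copies s one byte at a time, the way a loop over
// command[i] does. Each byte is converted as if it were a code point.
func rebuildByBytes(s string) string {
	out := ""
	for i := 0; i < len(s); i++ {
		out += string(s[i])
	}
	return out
}

// rebuildByRunes copies s one rune at a time using range, which decodes
// UTF-8 sequences into whole code points.
func rebuildByRunes(s string) string {
	out := ""
	for _, r := range s {
		out += string(r)
	}
	return out
}

func main() {
	s := "世界"
	fmt.Println(rebuildByBytes(s) == s) // false: the bytes were re-encoded individually
	fmt.Println(rebuildByRunes(s) == s) // true
}
```

Ranging over the string instead of indexing it is usually enough to avoid this.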

Answer 3

Score: 4

If the arguments were passed to your program on the command line, the shell should handle this and os.Args will be populated correctly. For example, in your case os.Args[1:] will equal

[]string{"-v", "--format", "some example", "-i", "test"}

If you only have the string, though, and you'd like to mimic what the shell would do with it, then I recommend a package like https://github.com/kballard/go-shellquote
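
Once the string has been split into a slice (by the shell or by a splitting package), the standard flag package can take it from there. The sketch below is only an illustration: the flag names v, format and i come from the question's example, while their types, the options struct and the parseTokens helper are assumptions.

```go
package main

import (
	"flag"
	"fmt"
)

// options holds the values extracted from the token slice (hypothetical layout).
type options struct {
	verbose bool
	format  string
	input   string
	rest    []string // positional arguments left after the flags
}

// parseTokens feeds already-split tokens to a dedicated flag.FlagSet
// instead of the global one, so it does not depend on os.Args.
func parseTokens(tokens []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.BoolVar(&o.verbose, "v", false, "verbose output")
	fs.StringVar(&o.format, "format", "", "output format")
	fs.StringVar(&o.input, "i", "", "input name")
	if err := fs.Parse(tokens); err != nil {
		return o, err
	}
	o.rest = fs.Args()
	return o, nil
}

func main() {
	o, err := parseTokens([]string{"-v", "--format", "some example", "-i", "test"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```

The flag package accepts both -format and --format, so the tokens from the question parse without changes.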

Answer 4

Score: 2

@laurent's answer is wonderful, but it doesn't work when command contains non-ASCII UTF-8 characters.

It fails the third test:

import (
	"reflect"
	"testing"
)

func TestParseCommandLine(t *testing.T) {
	tests := []struct {
		name  string
		input string
		want  []string
	}{
		{
			"normal",
			"hello world",
			[]string{"hello", "world"},
		},
		{
			"quote",
			"hello \"world hello\"",
			[]string{"hello", "world hello"},
		},
		{
			"utf-8",
			"hello 世界",
			[]string{"hello", "世界"},
		},
		{
			"space",
			"hello\\ world",
			[]string{"hello world"},
		},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got, err := parseCommandLine(tt.input)
			if err != nil {
				t.Fatalf("unexpected error: %v", err)
			}
			if !reflect.DeepEqual(got, tt.want) {
				t.Errorf("expect %v, got %v", tt.want, got)
			}
		})
	}
}

Based on that answer, I wrote this function, which works well with UTF-8, simply by replacing for i := 0; i < len(command); i++ { c := command[i] with for _, c := range command.

Here's my answer:

import "fmt"

// parseCommandLine splits command into arguments, honoring quotes and
// backslash escapes outside quotes. It iterates over runes, so multi-byte
// UTF-8 characters are preserved.
func parseCommandLine(command string) ([]string, error) {
	var args []string
	state := "start" // "start" (between args), "arg" (inside an arg) or "quotes"
	current := ""
	quote := "\""
	escapeNext := false // starting with true would treat the first character as escaped
	for _, c := range command {

		// Inside quotes: copy everything up to the matching quote.
		if state == "quotes" {
			if string(c) != quote {
				current += string(c)
			} else {
				args = append(args, current)
				current = ""
				state = "start"
			}
			continue
		}

		// The previous character was a backslash: take this one literally.
		if escapeNext {
			current += string(c)
			escapeNext = false
			continue
		}

		if c == '\\' {
			escapeNext = true
			continue
		}

		if c == '"' || c == '\'' {
			state = "quotes"
			quote = string(c)
			continue
		}

		if state == "arg" {
			if c == ' ' || c == '\t' {
				args = append(args, current)
				current = ""
				state = "start"
			} else {
				current += string(c)
			}
			continue
		}

		if c != ' ' && c != '\t' {
			state = "arg"
			current += string(c)
		}
	}

	if state == "quotes" {
		return nil, fmt.Errorf("unclosed quote in command line: %s", command)
	}

	if current != "" {
		args = append(args, current)
	}

	return args, nil
}
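
For comparison, here is a variant of the same idea, not the answer's code. It uses a strings.Builder and keeps a quoted run attached to the text around it, so --format="some example" stays one token and '' yields an empty argument. The name splitArgs and the decision to reject a trailing backslash are assumptions. Backslash escapes inside double quotes, variable expansion and globbing are deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitArgs splits s on spaces and tabs, treating single- and double-quoted
// runs as literal text and a backslash outside quotes as an escape for the
// next character.
func splitArgs(s string) ([]string, error) {
	var (
		args    []string
		cur     strings.Builder
		inToken bool // true once a token has started, even if it is still empty
		quote   rune // active quote character, 0 when outside quotes
		escaped bool // true right after an unquoted backslash
	)
	for _, r := range s {
		switch {
		case escaped:
			cur.WriteRune(r)
			escaped = false
		case quote != 0:
			if r == quote {
				quote = 0 // closing quote: the token continues
			} else {
				cur.WriteRune(r)
			}
		case r == '\\':
			escaped = true
			inToken = true
		case r == '"' || r == '\'':
			quote = r
			inToken = true
		case r == ' ' || r == '\t':
			if inToken {
				args = append(args, cur.String())
				cur.Reset()
				inToken = false
			}
		default:
			cur.WriteRune(r)
			inToken = true
		}
	}
	if quote != 0 {
		return nil, fmt.Errorf("unclosed quote in command line: %s", s)
	}
	if escaped {
		return nil, errors.New("trailing backslash in command line")
	}
	if inToken {
		args = append(args, cur.String())
	}
	return args, nil
}

func main() {
	args, err := splitArgs(`-v --format="some example" -i 世界 ''`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", args)
}
```

Tracking "a token has started" separately from the buffer contents is what lets an empty quoted string count as an argument.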

Answer 5

Score: 0

hedzr/cmdr might be a good fit. It's a getopt-like command-line parser: lightweight, with either a fluent API or a classical style.

Answer 6

Score: 0

I know this is an old question, but it might still be relevant. What about using a regex? It is quite simple and might be enough for most cases:

r := regexp.MustCompile(`\"[^\"]+\"|\S+`)
m := r.FindAllString(`-v --format "some example" -i test`, -1)
fmt.Printf("%q", m)
// Prints out ["-v" "--format" "\"some example\"" "-i" "test"]

You can try it at https://go.dev/play/p/1K0MlsOUzQI

Edit:

To also handle test\ abc as a single entry, use this regex: \"[^\"]+\"|\S+\\\s\S+|\S+
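
Note that the regex approach keeps the surrounding quotes on a quoted token. If the tokens are wanted without them, one option is to capture the quoted text in a group. This is only a sketch of that idea: the pattern and the name splitWithRegexp are assumptions rather than part of the answer, and it handles double quotes only, with no backslash escapes.

```go
package main

import (
	"fmt"
	"regexp"
)

// tokenRe matches either a double-quoted run (its content is group 1)
// or a run of non-whitespace characters (group 2).
var tokenRe = regexp.MustCompile(`"([^"]*)"|(\S+)`)

// splitWithRegexp returns the tokens of s with surrounding double quotes removed.
func splitWithRegexp(s string) []string {
	var tokens []string
	for _, m := range tokenRe.FindAllStringSubmatch(s, -1) {
		if m[2] != "" {
			tokens = append(tokens, m[2]) // unquoted run
		} else {
			tokens = append(tokens, m[1]) // quoted run, quotes dropped
		}
	}
	return tokens
}

func main() {
	fmt.Printf("%q\n", splitWithRegexp(`-v --format "some example" -i test`))
	// Prints ["-v" "--format" "some example" "-i" "test"]
}
```

A quote that starts in the middle of a word, as in --format="some example", is still split at the space, so a small state machine remains the safer choice for shell-like input.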

huangapple
  • Posted on 2015-12-06 22:53:23
  • When reposting, please keep the link to this article: https://go.coder-hub.com/34118732.html