Why is '\' invalid in this command called with os/exec?

Question

When I execute this code written in Go:

package main

import ( "fmt" 
"os/exec"
)

func donde(num string) string {
  cmd := fmt.Sprintf("wget -qO- \"https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18&edadh=30\"|grep -av \"https:\"|grep -av \"contactos\"|grep -av \"javascript\"|grep -av \"href=\\"/\"", num)
  out, err := exec.Command("bash", "-c", cmd).Output()
  if err != nil {
    return fmt.Sprintf("Failed to execute command: %s", cmd)
  }
  return string(out)
}

func main() {
  chicas := map[string][]string{
    "Alexia": {"600080000"},
    "Paola":  {"600070008", "600050007", "600000005", "600000001", "600004", "600000000"},
  }

  for k, v := range chicas {
    fmt.Printf("%s\n", k)
    for index := range v {
      c := donde(v[index])
      exec.Command("bash", "-c", c)
      fmt.Println(c)
    }
  }
}

I get:

./favoritas.go:8:189: invalid operation: "wget -qO- \"https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18... / "" (operator / not defined on untyped string)
./favoritas.go:8:190: invalid character U+005C '\'

grep -av \"href=\\"/\" seems to be the culprit. Interestingly, similar Python code
works just fine:

from subprocess import run
v = "600000005"
dnd = run('wget -qO- \"https://www.pasion.com/contactos-mujeres/'+v+'.htm?edadd=18&edadh=30\" |grep -av \"https:\"|grep -av \"contactos\"|grep -av \"javascript\" |grep -av \"href=\\"/\"' , capture_output=True, shell=True, text=True, encoding='latin-1').stdout
print(dnd)

and wget -qO- "https://www.pasion.com/contactos-mujeres/600000003.htm?edadd=18&edadh=30" |grep -av "https:"|grep -av "contactos"|grep -av "javascript" |grep -av "href=\"/" executed from my shell (I use Bash) works fine as well.
Why can't I accomplish the same in my Go code? How might I resolve this issue?

P.S. What is pasted here are just snippets of longer programs.

Answer 1

Score: 2

Escaping quotes within a language within a language is hard. Use an alternate syntax when one is available to alleviate the pain.

Your syntax is complex because you chose to enclose the Go string in double quotes, but the string itself contains double quotes, so each of them must be escaped. On top of that, the shell command contains a double quote inside a double-quoted bash string, and that one must be escaped for bash as well. You escaped them, but made a typo in the escaping at the end:

"wget -qO- \"https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18&edadh=30\"|grep -av \"https:\"|grep -av \"contactos\"|grep -av \"javascript\"|grep -av \"href=\\"/\""

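You escaped the backslash (\\), but did not add another backslash to escape the quote that follows it, so the string literal ended right there. The / is therefore outside the string, and Go parses it as an operator applied to the string. Strings have no / operator, hence the first error. The \ that comes next is also outside any string literal, which is why the compiler reports invalid character U+005C '\'. The sequence you wanted is \\\" (an escaped backslash followed by an escaped quote).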
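To make the difference concrete, here is a minimal sketch of how Go reads the two escape sequences. The constant names endsEarly and escaped are made up for this illustration and do not appear in the question.

```go
package main

import "fmt"

// endsEarly shows where the broken literal stops: \\ is an escaped
// backslash, so the very next " closes the string.
const endsEarly = "href=\\"

// escaped uses \\ for the backslash and \" for the quote, so the literal
// continues up to the final unescaped ".
const escaped = "href=\\\"/"

func main() {
	fmt.Println(endsEarly) // href=\
	fmt.Println(escaped)   // href=\"/
}
```

The second value is the text bash needs to see inside its own double-quoted string.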

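Written as a Go raw string literal (backticks), with single quotes around the last grep pattern on the bash side, the command needs no escaping at all:

`wget -qO- "https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18&edadh=30"|grep -av "https:"|grep -av "contactos"|grep -av "javascript"|grep -av 'href="/'`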

Key takeaway: when a Go string contains quotes, enclose it in backticks (a raw string literal) where appropriate; then you won't need to escape the quotes inside it.
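As a rough side-by-side sketch, the two functions below build the same pipeline, once with a correctly escaped double-quoted string and once with a raw string. The function names escapedCommand and rawCommand are hypothetical; the command text is taken from the question. The program only prints the commands and does not run them.

```go
package main

import "fmt"

// escapedCommand uses an interpreted (double-quoted) Go string. Every "
// meant for bash is written \" and the \" meant for grep is written \\\".
func escapedCommand(num string) string {
	return fmt.Sprintf("wget -qO- \"https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18&edadh=30\"|grep -av \"https:\"|grep -av \"contactos\"|grep -av \"javascript\"|grep -av \"href=\\\"/\"", num)
}

// rawCommand uses a raw (backtick) Go string, where backslashes and double
// quotes have no special meaning. The last grep pattern is single-quoted
// for bash, so it needs no backslash either.
func rawCommand(num string) string {
	return fmt.Sprintf(`wget -qO- "https://www.pasion.com/contactos-mujeres/%s.htm?edadd=18&edadh=30"|grep -av "https:"|grep -av "contactos"|grep -av "javascript"|grep -av 'href="/'`, num)
}

func main() {
	// Print both commands for comparison.
	fmt.Println(escapedCommand("600000005"))
	fmt.Println(rawCommand("600000005"))
}
```

The two strings differ only in how the last pattern is quoted for bash; in both cases grep receives the pattern href="/. Note that a raw string literal cannot itself contain a backtick.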

Additionally, single quotes in bash disable all special characters until the next single quote. grep -av 'href="/' is more straightforward, no?

Key takeaway: use single quotes in bash, when appropriate, to delimit literal strings.

Better yet, don't shell out unless you really have to

All your pain here comes from taking code that was valid in bash and trying to embed it in another programming language. Don't do that unless you really have to.

Consider alternatives that might make your life easier:

  • Make the HTTP request with Go's net/http package instead of wget.

  • Parse the HTML in the response with https://pkg.go.dev/golang.org/x/net/html, which will be more robust than grep. HTML content does not grep well.

Posted by huangapple on 2022-02-24 22:36:55
Original link: https://go.coder-hub.com/71253623.html