Read from initial stdin in Go?

Question

I would like to read from the original stdin of a Go program. For example, if I did echo test stdin | go run test.go, I would want to have access to "test stdin". I've tried reading from os.Stdin, but if there's nothing in it, then it will wait for input. I also tried checking the size first, but the os.Stdin.Stat().Size() is 0 even when input is passed in.

What can I do?

Answer 1

Score: 79

Reading from stdin using os.Stdin should work as expected:

package main

import (
	"io"
	"log"
	"os"
)

func main() {
	// Read everything from stdin until EOF.
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		log.Fatal(err)
	}

	log.Println(string(data))
}

Executing echo test stdin | go run stdin.go should print 'test stdin' just fine.

It would help if you'd attach the code you used to identify the problem you encountered.

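For line-based reading you can use bufio.Scanner:

package main

import (
	"bufio"
	"log"
	"os"
)

func main() {
	s := bufio.NewScanner(os.Stdin)
	// Scan returns false at EOF or on a read error.
	for s.Scan() {
		log.Println("line", s.Text())
	}
	if err := s.Err(); err != nil {
		log.Fatal(err)
	}
}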

Answer 2

Score: 23

I think your question per se has no sensible answer because there is no such thing as an "initial stdin". Unix-like OSes and Windows implement the concept of "standard streams", which works like this (simplified): when a process is created, it automatically has three file descriptors (handles on Windows) open: stdin, stdout and stderr. No doubt you're familiar with this concept, but I'd like to stress the meaning of the word "stream" here. In your example, when you call

$ echo 'test stdin' | ./stdin

the shell creates a pipe, spawns two processes (one for echo and one for your binary) and makes use of the pipe it created: the pipe's write FD is attached to echo's stdout and the pipe's read FD is attached to your binary's stdin. Then whatever the echo process chooses to write to its stdout is piped (sic!) to the stdin of your process.
(In reality most of today's shells implement echo as a built-in primitive, but this does not change the semantics in any way; you could just as well have tried /bin/echo instead, which is a real program. Also note that I used ./stdin to refer to your program for clarity, as go run stdin.go would in the end do exactly this.)

Note several crucial things here:

  • The writing process (echo in your case) is not obliged to write anything to its stdout (for instance, echo -n would not write anything to its stdout and would still exit successfully).
  • It is also able to delay writing its data arbitrarily (either because it wants to, or because it has been preempted by the OS, or because it sleeps in some syscall waiting on a busy system resource, etc).
  • The OS buffers transfers over pipes, and often the code in the program does this too, though that may not be apparent to an inexperienced programmer. This means that what the writing process sends to a pipe might come out in arbitrary chunks on the reading side. (1)
  • There are only two ways to know the writing side has no more data to send over the pipe:
    • Somehow encode this in the data itself (this means using an agreed-upon data transfer protocol between the writer and the reader).
    • The writer might close its side of the pipe, which results in the "end of file" condition on the reader side (but only after the buffer is drained and one more call to read is attempted, which then reports EOF).

Let's wrap this up: the behaviour you're observing is correct and normal. If you expect to get any data from stdin, you must not expect it to be readily available. If you also don't want to block on stdin, then create a goroutine which would do blocking reads from stdin in an endless loop (but checking for the EOF condition) and pass collected data up over a channel (possibly after certain processing, if needed).

(1) This is why certain tools which usually sit between two pipes in a pipeline, such as grep, may have special options to make them flush their stdout after writing each line; read about the --line-buffered option in the grep manual page for one example. People who are not aware of these "full buffering by default" semantics are puzzled why tail -f /path/to/some/file.log | grep whatever | sed ... seems to stall and not display anything when the monitored file is obviously being updated.


As a side note: if you were to run your binary "as is", like in

$ ./stdin

that would not mean the spawned process has no stdin (or "initial stdin" or whatever). Instead, its stdin would be connected to the same stream your shell receives your keyboard input from (so you could type something directly into your process's stdin).

The only sure way to have a process's stdin connected to nowhere is to use

$ ./stdin </dev/null

on Unix-like OSes and

C:\> stdin <NUL

on Windows. This "null device" makes the process see EOF on the first read from its stdin.

Answer 3

Score: 6

You can't check stdin for content, but you can check whether stdin is associated with a terminal or with a pipe. IsTerminal just takes the standard Unix fd numbers (0, 1, 2). The syscall package has variables assigned, so you can use syscall.Stdin if you prefer naming them.

package main

import (
	"fmt"
	"io"
	"log"
	"os"

	// The original answer imported code.google.com/p/go.crypto/ssh/terminal,
	// which is no longer hosted; golang.org/x/term provides IsTerminal today.
	"golang.org/x/term"
)

func main() {
	// os.Stdin.Fd() is 0 on Unix-like systems.
	if !term.IsTerminal(int(os.Stdin.Fd())) {
		b, err := io.ReadAll(os.Stdin)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Print(string(b))
	} else {
		fmt.Println("no piped data")
	}
}

huangapple
  • Posted on September 11, 2012 at 13:01:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/12363030.html