How to read and format a stream of text received through a bash pipe?
Question
Currently, I'm using the following to format the output of my npm script:
    npm run startWin | while IFS= read -r line; do printf '%b\n' "$line"; done | less
It works, but my colleagues do not use Linux. So I would like to implement the loop while IFS= read -r line; do printf '%b\n' "$line"; done in Go and use the resulting binary in the pipe:
    npm run startWin | magical-go-formater
What I tried
    package main

    import (
        "fmt"
        "io/ioutil"
        "os"
        "strings"
    )

    func main() {
        fi, _ := os.Stdin.Stat() // get the FileInfo struct
        // Only read when stdin is not a terminal (i.e. it is a pipe or a file).
        if (fi.Mode() & os.ModeCharDevice) == 0 {
            bytes, _ := ioutil.ReadAll(os.Stdin)
            str := string(bytes)
            arr := strings.Fields(str)
            for _, v := range arr {
                fmt.Println(v)
            }
        }
    }
Currently the program silences any output from the text-stream.
Answer 1
Score: 2
You want to use bufio.Scanner for tail-type reads. IMHO the checks you're doing on os.Stdin are unnecessary, but YMMV.
See this answer for an example. ioutil.ReadAll() (now deprecated, just use io.ReadAll()) reads until an error or EOF, but it is not a looping input. That's why you want bufio.Scanner.Scan().
Also, %b will convert any escape sequence in the text, e.g. any \n in a passed line will be rendered as a newline. Do you need that? Because Go does not have an equivalent format specifier, AFAIK.
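Since Go has no direct counterpart to %b, one option is to expand the escapes by hand. The following is a minimal sketch, not part of the original answer: expandEscapes is a hypothetical helper that handles only \\, \n, \t and \r (printf '%b' understands more, such as octal escapes), combined with the bufio.Scanner loop suggested above.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// escapes maps a hand-picked subset of backslash sequences to the
// characters they denote. This is an assumption for illustration and
// is not the full set that printf '%b' supports.
var escapes = strings.NewReplacer(
	`\\`, `\`,
	`\n`, "\n",
	`\t`, "\t",
	`\r`, "\r",
)

// expandEscapes (hypothetical helper) rewrites the literal two-character
// sequences listed above into the characters they stand for.
func expandEscapes(line string) string {
	return escapes.Replace(line)
}

func main() {
	// Scan returns each line as soon as it is complete, so output
	// appears while the upstream process is still running.
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		fmt.Println(expandEscapes(scanner.Text()))
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
}
```

Built as magical-go-formater, this would be used as npm run startWin | magical-go-formater. An escaped backslash followed by n (\\n) stays a literal \n rather than becoming a newline, because the replacer consumes \\ first.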
EDIT
So I think that your ReadAll()-based approach would/could have worked... eventually. I am guessing that you were expecting behavior like you get with bufio.Scanner: the receiving process handles bytes as they are written (under the hood, Scan() calls Read in a loop and hands back each line as soon as it is complete; see the standard library source for Scan() for the grimy details).
But ReadAll() buffers everything it reads and doesn't return until it finally gets either an error or an EOF. I hacked up an instrumented version of ReadAll() (an exact copy of the standard library source with just a little additional instrumentation output), and you can see that it reads as the bytes are written, but it doesn't return and yield the contents until the writing process is finished and closes its end of the pipe (its open filehandle), which generates the EOF:
    package main

    import (
        "fmt"
        "io"
        "os"
        "time"
    )

    func main() {
        // os.Stdin.SetReadDeadline(time.Now().Add(2 * time.Second))
        b, err := readAll(os.Stdin)
        if err != nil {
            fmt.Println("ERROR: ", err.Error())
        }
        str := string(b)
        fmt.Println(str)
    }

    func readAll(r io.Reader) ([]byte, error) {
        b := make([]byte, 0, 512)
        i := 0
        for {
            if len(b) == cap(b) {
                // Add more capacity (let append pick how much).
                b = append(b, 0)[:len(b)]
            }
            n, err := r.Read(b[len(b):cap(b)])
            //fmt.Fprintf(os.Stderr, "READ %d - RECEIVED: \n%s\n", i, string(b[len(b):cap(b)]))
            fmt.Fprintf(os.Stderr, "%s READ %d - RECEIVED %d BYTES\n", time.Now(), i, n)
            i++
            b = b[:len(b)+n]
            if err != nil {
                if err == io.EOF {
                    fmt.Fprintln(os.Stderr, "RECEIVED EOF")
                    err = nil
                }
                return b, err
            }
        }
    }
I just hacked up a cheap script to generate the input, simulating something long-running that writes only at periodic intervals, which is how I'd imagine npm is behaving in your case:
    #!/bin/sh
    for x in 1 2 3 4 5 6 7 8 9 10
    do
        cat ./main.go
        sleep 10
    done
As a side note, I find reading the actual standard library code really helpful...or at least interesting in cases like this.
Answer 2
Score: 0
@Sandy Cash was helpful in pointing me to bufio. I don't know why, if what @Jim said is true, but bufio worked out and ReadAll() didn't.
Thanks for the help.
The code:
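    package main

    import (
        "bufio"
        "fmt"
        "os"
        "strings"
    )

    func main() {
        // Scan returns one line at a time, as soon as each line is complete.
        scanner := bufio.NewScanner(os.Stdin)
        for scanner.Scan() {
            s := scanner.Text()
            // The raw string `\n` is a literal backslash followed by n,
            // so this splits on escaped newlines embedded in the line.
            arr := strings.Split(s, `\n`)
            for _, v := range arr {
                fmt.Println(v)
            }
        }
    }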