Filtering non-json content in a json stream in Go
Question
I'm working with an input stream of json structures in Go. I receive the input stream from another application on my stdin and I can't alter the communications protocol.
The problem I have is that every json structure is terminated by a non-json string line: "end" (without the quotes).
I'm using Go's encoding/json package to decode the JSON I'm receiving from stdin. The problem is that the decoder produces an error the second time I call it, with the message: "invalid character 'e' looking for beginning of value".
The issue, of course, is that the "end" string is not JSON encoded. I would like to know how I can have Go's JSON decoder skip over this string.
Some sample input:
{"command": "ack", "id": "1231231"}
end
{"command": "fail", "id": "1231231"}
end
{
"command": "log",
// the message to log
"msg": "hello world!"
}
end
Things I've tried:
- I declared: endStr := make([]byte, 10)
- I've tried to use fmt.Fscanf(os.Stdin, "%s", endStr), to read past the string, but no data are read.
- I've tried to use os.Stdin.Read(endStr), but it also returns no data.
- After I read the first json structure, dec.Buffered() returns an io.Reader containing the "end" string, but I don't know how to tell the decoder to skip over this.
Any help would be appreciated.
Answer 1
Score: 3
So the best solution I've been able to come up with is:
- Ditch the json Decoders,
- read a byte slice from stdin,
- trim the slice to exclude the ("\nend\n") character string
- pass the trimmed slice to a json Unmarshaller
The code I had to write:
// Create a buffer to hold the stream data
data := make([]byte, 5000)
// Read data from stdin in a loop.
// Note: this assumes each Read returns exactly one complete message.
for {
    n, err := os.Stdin.Read(data)
    if err != nil {
        panic(err)
    }
    // Keep only the bytes that precede the "\nend\n" terminator
    msg := data[:n]
    if index := bytes.Index(msg, []byte("\nend\n")); index >= 0 {
        msg = msg[:index]
    }
    var myStruct MyStruct
    err = json.Unmarshal(msg, &myStruct)
    if err != nil {
        panic(err)
    }
    // (Do something with myStruct)
}
Answer 2
Score: 2
This is messy, but it will do the trick:
package main

import (
    "bufio"
    "bytes"
    "encoding/json"
    "fmt"
    "io"
)

var input = `{"command": "ack", "id": "1231231"}
end
{"command": "fail", "id": "1231231"}
end
`

func main() {
    // make an io.Reader out of our input constant
    var src = bytes.NewBuffer([]byte(input))
    // we're going to need a buffered reader so we can Peek
    var buf = bufio.NewReader(src)
    // This is the result of the decode. Use whatever makes sense for you
    var res map[string]interface{}
    var err error
    var dec *json.Decoder
    // We're going to loop until we get an error (hopefully it will be io.EOF)
    for err == nil {
        if dec != nil {
            // This is the tricky bit: json.Decoder has its own buffer.
            // It will read more than the data it needs. In my simple test,
            // it buffers all of the data. What we're doing here is constructing
            // a new bufio.Reader using the remaining bytes in the json decoder's buffer
            // and whatever hasn't been read from our original buffer.
            buf = bufio.NewReader(io.MultiReader(dec.Buffered(), buf))
        }
        // Now let's try to drop an 'end' statement from the buffer
        dropEnd(buf)
        // We need a new json.Decoder each time since the old one contains unusable
        // data in its internal buffer.
        dec = json.NewDecoder(buf)
        // do the decode
        if err = dec.Decode(&res); err == nil {
            fmt.Println("Read:", res)
        }
    }
    if err != io.EOF {
        fmt.Println("Unexpected error:", err)
    }
}

func dropEnd(buf *bufio.Reader) {
    var check = make([]byte, 4)
    // If the next 4 bytes (either "\nend" or "end\n") contain "end", read them off the buffer
    if check, _ = buf.Peek(4); bytes.Contains(check, []byte("end")) {
        buf.Read(check)
    }
}
You can play with this code here: http://play.golang.org/p/7NER_fTzXI
Answer 3
Score: 1
You can wrap your os.Stdin in a bufio.Reader from the bufio package. Then use buf.Peek(num) to look ahead before you Decode.
You can also use a custom Scanner to delimit the JSON chunks.
What's nice about using bufio vs a static buffer is it'll work on a stream.
Answer 4
Score: 0
If you can restrict your JSON objects to a single line, you just have to break the input by line and ignore what doesn't Unmarshal. Here's a small piece of code.
package main

import (
    "encoding/json"
    "fmt"
    "strings"
)

var input = `{"command": "ack", "id": "1231231"}
end
{"command": "fail", "id": "1231231"}
end
`

type Obj struct {
    Command string
    Msg     string
    Id      string
}

func DoSomethingCool(o Obj) {
    // Do something cool here
    fmt.Println(o)
}

func main() {
    inputs := strings.Split(input, "\n")
    for _, v := range inputs {
        var obj Obj
        if err := json.Unmarshal([]byte(v), &obj); err == nil {
            DoSomethingCool(obj) // Got a valid JSON
        }
    }
}