Debugging a JSON error from Golang

Question

I'm fetching and decoding a large JSON response that has an error in it. Now I need to find where the error is! I read about json.SyntaxError but I am struggling to find out how to use it.

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"text/template"
	"time"
)

type Movie struct {
	Title       string    `json:"title"`
	PublishedAt time.Time `json:"published_at"`
}

func main() {
	req, _ := http.NewRequest("GET", "https://s.natalian.org/2016-12-07/debugme2.json", nil)
	resp, err := http.DefaultClient.Do(req)

	defer resp.Body.Close()
	dec := json.NewDecoder(resp.Body)

	_, err = dec.Token()
	for dec.More() {
		var m Movie
		if err = dec.Decode(&m); err != nil {
			fmt.Println(err)
			fmt.Println("Bad", m)

			// https://blog.golang.org/error-handling-and-go
			if serr, ok := err.(*json.SyntaxError); ok {
				fmt.Println("Syntax error", serr)
			}

		} else {
			fmt.Println("Good", m)
		}

		tmpl := template.Must(template.New("test").Parse("OUTPUT: {{ if .Title }}{{.Title}}{{ if .PublishedAt }} was published at {{.PublishedAt}} {{ end }}{{end}}\n"))
		tmpl.Execute(os.Stdout, m)
	}

}

What am I missing? Any tools or strategies or suggestions would be much appreciated. My output currently looks like:

Good {foobar 2016-11-24 16:17:12 +0800 SGT}
OUTPUT: foobar was published at 2016-11-24 16:17:12 +0800 SGT
parsing time ""null"" as ""2006-01-02T15:04:05Z07:00"": cannot parse "null"" as "2006"
Bad {barbar 0001-01-01 00:00:00 +0000 UTC}
OUTPUT: barbar was published at 0001-01-01 00:00:00 +0000 UTC
Good { 1999-12-24 16:11:12 +0200 +0200}
OUTPUT:
Good {Something else entirely 2000-01-24 16:11:12 +0200 +0200}
OUTPUT: Something else entirely was published at 2000-01-24 16:11:12 +0200 +0200

But I need something like this in my stderr to better debug the issue:

Line 8: published_at is invalid

And maybe some context of the Title so I can tell the API backend team they have an error in their JSON response.

BONUS question: Furthermore, I don't want to print the value 0001-01-01 00:00:00 +0000 UTC, as it is effectively empty. I don't mind it being missing.

Answer 1

Score: 5

I found a solution:

if err := json.Unmarshal([]byte(data), &myStruct); err != nil {
    if jsonErr, ok := err.(*json.SyntaxError); ok {
        // Clamp the window so an offset near either end of data cannot panic.
        start, end := jsonErr.Offset-10, jsonErr.Offset+10
        if start < 0 {
            start = 0
        }
        if end > int64(len(data)) {
            end = int64(len(data))
        }
        err = fmt.Errorf("%w ~ error near '%s' (offset %d)", err, data[start:end], jsonErr.Offset)
    }
}

It prints something like:

invalid character 'n' after object key:value pair ~ error near 'rence\","numberOfBil' (offset 14557)
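
Since the question asks for a line number rather than a byte offset, the offset can be converted by counting newlines in front of it. The following is a minimal sketch under that assumption; syntaxErrorPosition is a hypothetical helper name, not something encoding/json provides, and the sample input is made up.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// syntaxErrorPosition reports the 1-based line and column at which
// encoding/json stopped reading data. It is a hypothetical helper for this
// sketch, not part of the standard library.
func syntaxErrorPosition(data []byte) (line, col int, found bool) {
	var v interface{}
	err := json.Unmarshal(data, &v)

	var syntaxErr *json.SyntaxError
	if !errors.As(err, &syntaxErr) {
		return 0, 0, false
	}

	// Offset counts the bytes read before the error was detected, so clamp
	// it to the input length before slicing.
	offset := syntaxErr.Offset
	if offset < 0 {
		offset = 0
	}
	if offset > int64(len(data)) {
		offset = int64(len(data))
	}
	prefix := data[:offset]

	line = bytes.Count(prefix, []byte("\n")) + 1
	col = len(prefix) - (bytes.LastIndexByte(prefix, '\n') + 1)
	return line, col, true
}

func main() {
	// Made-up input: the colon after "title" is missing on the third line.
	data := []byte("[\n  {\"title\": \"a\"},\n  {\"title\" \"b\"}\n]")
	if line, col, ok := syntaxErrorPosition(data); ok {
		fmt.Printf("syntax error near line %d, column %d\n", line, col)
	}
}
```

This only helps with malformed JSON. A value that is well-formed JSON but has the wrong content, such as an unparsable date, does not produce a *json.SyntaxError.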

Answer 2

Score: 4

One way to both accept null values and print nothing when published_at is null is to make the PublishedAt field a pointer:

type Movie struct {
    Title       string     `json:"title"`
    PublishedAt *time.Time `json:"published_at"`
}
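
As a rough illustration of the pointer approach, the sketch below decodes one movie with a JSON null and one with a valid timestamp, then renders both with a template similar to the one in the question. decodeMovie and render are hypothetical helper names, and the sample inputs are made up. Note that this covers a JSON null only; a quoted string such as "null" would still fail to parse as a time.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"text/template"
	"time"
)

type Movie struct {
	Title       string     `json:"title"`
	PublishedAt *time.Time `json:"published_at"`
}

// A nil pointer counts as empty in a template, so the date clause is skipped.
var tmpl = template.Must(template.New("movie").Parse(
	"{{.Title}}{{if .PublishedAt}} was published at {{.PublishedAt}}{{end}}\n"))

// decodeMovie is a hypothetical helper that unmarshals a single movie object.
func decodeMovie(raw string) (Movie, error) {
	var m Movie
	err := json.Unmarshal([]byte(raw), &m)
	return m, err
}

// render is a hypothetical helper that executes the template into a string.
func render(m Movie) (string, error) {
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, m); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	inputs := []string{
		`{"title":"foobar","published_at":"2016-11-24T16:17:12+08:00"}`,
		`{"title":"barbar","published_at":null}`,
	}
	for _, raw := range inputs {
		m, err := decodeMovie(raw)
		if err != nil {
			fmt.Println("decode failed:", err)
			continue
		}
		out, err := render(m)
		if err != nil {
			fmt.Println("render failed:", err)
			continue
		}
		fmt.Print(out)
	}
}
```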

The input string is valid JSON, so the json package does not return a SyntaxError.

The json package has some other error types, such as UnmarshalTypeError, which is returned when a JSON value does not match the Go type it is decoded into (e.g. string, int, array ...).
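
For comparison, here is a small sketch of what an UnmarshalTypeError carries when the mismatch involves a built-in type. The Year field and the typeMismatch helper are assumptions made up for this example; they are not part of the question's data.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Movie here has a hypothetical integer field so that a type mismatch can be
// provoked with a built-in type.
type Movie struct {
	Title string `json:"title"`
	Year  int    `json:"year"`
}

// typeMismatch is a hypothetical helper. It returns the JSON kind that was
// found and the byte offset reported by encoding/json when decoding fails
// with an UnmarshalTypeError.
func typeMismatch(raw string) (jsonKind string, offset int64, found bool) {
	var m Movie
	err := json.Unmarshal([]byte(raw), &m)

	var typeErr *json.UnmarshalTypeError
	if !errors.As(err, &typeErr) {
		return "", 0, false
	}
	fmt.Printf("field %q: got JSON %s, want Go type %s (offset %d)\n",
		typeErr.Field, typeErr.Value, typeErr.Type, typeErr.Offset)
	return typeErr.Value, typeErr.Offset, true
}

func main() {
	// "year" holds a JSON string, but the struct field is an int.
	typeMismatch(`{"title":"foobar","year":"nineteen ninety-nine"}`)
}
```

Unlike the time.ParseError case discussed next, this error type names the field and gives an offset, which is the kind of context the question asks for.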

Unfortunately, when it calls a custom UnmarshalJSON() method, the json package appears to return that method's raw error:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"time"
)

// check the full type of an error raised when Unmarshaling a json string
func main() {
	var test struct {
		Clock time.Time
	}
	buf := bytes.NewBufferString(`{"Clock":null}`)
	dec := json.NewDecoder(buf)

	// ask to decode an invalid null value into a flat time.Time field :
	err := dec.Decode(&test)
	
	// print the details of the returned error :
	fmt.Printf("%#v\n", err)
}

// Output:
// &time.ParseError{Layout:"\"2006-01-02T15:04:05Z07:00\"", Value:"null", LayoutElem:"\"", ValueElem:"null", Message:""}

https://play.golang.org/p/fhZxVpOflb

The final error comes straight from the time package. It is not wrapped in some kind of UnmarshalError from the json package, which could at least have told you "this error occurred when trying to unmarshal the value at this offset", so the error alone gives you no context.
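
One possible way to recover that context is to capture each array element as a json.RawMessage first and unmarshal it in a second step, so that a failure can be tied to an element index, a starting line, and a title. The sketch below assumes the response is a top-level array of movie objects and a Go version that has Decoder.InputOffset (1.15 or later). Problem and findProblems are hypothetical names, and the sample data is made up to resemble the question's output.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
	"time"
)

type Movie struct {
	Title       string    `json:"title"`
	PublishedAt time.Time `json:"published_at"`
}

// Problem describes one array element that failed to decode. The type and
// its fields are hypothetical names chosen for this sketch.
type Problem struct {
	Index int    // position in the array, starting at 0
	Line  int    // 1-based line on which the element starts
	Title string // title of the element, if it could be read
	Err   error
}

func findProblems(data []byte) ([]Problem, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	if _, err := dec.Token(); err != nil { // consume the opening '['
		return nil, err
	}

	var problems []Problem
	for i := 0; dec.More(); i++ {
		// Capture the element's raw bytes; this only fails on malformed JSON.
		var raw json.RawMessage
		if err := dec.Decode(&raw); err != nil {
			return problems, err
		}
		start := dec.InputOffset() - int64(len(raw))
		if start < 0 {
			start = 0
		}

		var m Movie
		if err := json.Unmarshal(raw, &m); err != nil {
			// Re-read only the title so the report has some context.
			var partial struct {
				Title string `json:"title"`
			}
			_ = json.Unmarshal(raw, &partial)
			problems = append(problems, Problem{
				Index: i,
				Line:  bytes.Count(data[:start], []byte("\n")) + 1,
				Title: partial.Title,
				Err:   err,
			})
		}
	}
	return problems, nil
}

func main() {
	data := []byte(`[
  {"title": "foobar", "published_at": "2016-11-24T16:17:12.000+08:00"},
  {"title": "barbar", "published_at": "null"}
]`)
	problems, err := findProblems(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, "malformed JSON:", err)
	}
	for _, p := range problems {
		fmt.Fprintf(os.Stderr, "element %d (line %d, title %q): %v\n", p.Index, p.Line, p.Title, p.Err)
	}
}
```

The trade-off is that every element is parsed twice, which is probably acceptable for a debugging aid but worth keeping in mind for very large responses.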


You can check specifically for a *time.ParseError:

if terr, ok := err.(*time.ParseError); ok {
    // in the example: Movie has a single time.Time field;
    // if a time.ParseError occurred, it was while trying to read that field
    fmt.Println("Error when trying to read 'published_at' value", terr)

    // you can leave the field at its zero value,
    // or, if you switched to a pointer field:
    m.PublishedAt = nil
}

If you happen to have several time fields (e.g. ProducedAt and PublishedAt), you can still check which field was left at its zero value:

if terr, ok := err.(*time.ParseError); ok {
    if m.ProducedAt.IsZero() {
        fmt.Println("Error when trying to read 'produced_at' value", terr)
    }

    if m.PublishedAt.IsZero() {
        fmt.Println("Error when trying to read 'published_at' value", terr)
    }
}

By the way: as specified in the docs, "0001-01-01 00:00:00 UTC" is the zero value that the Go team chose for time.Time.

Answer 3

Score: 0

The published_at value in your data is the string "null", not a JSON null, so I think you can declare PublishedAt as a string and parse it into a time.Time in your own code.

This is my test code:

package main

import (
	"encoding/json"

	"github.com/swanwish/go-common/logs"
	"github.com/swanwish/go-common/utils"
)

func main() {
	url := `https://s.natalian.org/2016-12-07/debugme2.json`
	_, content, err := utils.GetUrlContent(url)
	if err != nil {
		logs.Errorf("Failed to get content from url %s, the error is %v", url, err)
		return
	}

	movies := []struct {
		Title       string `json:"title"`
		PublishedAt string `json:"published_at"`
	}{}
	err = json.Unmarshal(content, &movies)
	if err != nil {
		logs.Errorf("Failed to unmarshal content %s, the error is %v", string(content), err)
		return
	}
	logs.Debugf("The movies are %v", movies)
}

The result is:

The movies are [{foobar 2016-11-24T16:17:12.000+08:00} {barbar null} { 1999-12-24T16:11:12.000+02:00} {Something else entirely 2000-01-24T16:11:12.000+02:00}]

Answer 4

Score: 0

It looks like madness, but it should work:

rawBody := []byte(`{"title":"test", "published_at":"2017-08-05T15:04:05Z", "edited_at":"05.08.2017"}`)

type Movie struct {
    Title       string    `json:"title"`
    PublishedAt time.Time `json:"published_at"`
    EditedAt    time.Time `json:"edited_at"`
}

var msg Movie

if err = json.Unmarshal(rawBody, &msg); err != nil {
    if _, ok := err.(*time.ParseError); ok {
        // Take the address first: Elem() panics on a non-pointer value.
        value := reflect.ValueOf(&msg).Elem()

        if value.Kind() != reflect.Struct {
            return err
        }

        for i := 0; i < value.NumField(); i++ {
            field := value.Field(i)

            // Report the first time.Time field that is still at its zero value.
            if t, ok := field.Interface().(time.Time); ok {
                if t.IsZero() {
                    name := value.Type().Field(i).Name
                    return fmt.Errorf("field: %s, message: %s", strings.ToLower(name), "time is not in RFC 3339 format")
                }
            }
        }
    }

    return err
}

This code returns only the first error it encounters. If PublishedAt is invalid, we learn nothing about EditedAt, even if it is valid.
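
If every invalid time field should be reported rather than only the first, one option is to skip reflection and check each field on its own. The sketch below is a different technique from the code above: it decodes the object into a map of raw values and tries each named field separately. invalidTimeFields is a hypothetical helper name, and the list of field names has to be supplied by the caller.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// invalidTimeFields is a hypothetical helper. It returns, for each named JSON
// field that is present but cannot be decoded as a time.Time, the error that
// decoding produced. The second result is non-nil only if raw itself is not a
// valid JSON object.
func invalidTimeFields(raw []byte, names ...string) (map[string]error, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(raw, &fields); err != nil {
		return nil, err
	}

	bad := make(map[string]error)
	for _, name := range names {
		value, present := fields[name]
		if !present {
			continue // a missing field is not treated as invalid here
		}
		var t time.Time
		if err := json.Unmarshal(value, &t); err != nil {
			bad[name] = err
		}
	}
	return bad, nil
}

func main() {
	rawBody := []byte(`{"title":"test", "published_at":"2017-08-05T15:04:05Z", "edited_at":"05.08.2017"}`)
	names := []string{"published_at", "edited_at"}

	bad, err := invalidTimeFields(rawBody, names...)
	if err != nil {
		fmt.Println("not a JSON object:", err)
		return
	}
	// Iterate over names rather than the map to keep the output order stable.
	for _, name := range names {
		if fieldErr, ok := bad[name]; ok {
			fmt.Printf("field: %s, message: %v\n", name, fieldErr)
		}
	}
}
```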

huangapple
  • Posted on December 7, 2016 at 16:06:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/41012247.html