October 7, 2013 23:17:30 · go

How to decompress gzip-formatted []byte content that gives an error when unmarshaling

Question

I'm making a request to an API and reading a []byte from the response with ioutil.ReadAll(resp.Body). I'm trying to unmarshal this content, but it doesn't seem to be UTF-8 encoded, because Unmarshal returns an error. This is what I'm trying:

package main

import (
	"encoding/json"
	"fmt"

	"some/api"
)

func main() {
	content := api.SomeAPI.SomeRequest() // []byte variable
	var data interface{}
	err := json.Unmarshal(content, &data)
	if err != nil {
		panic(err.Error())
	}
	fmt.Println("Data from response", data)
}

The error I get is: invalid character '\x1f' looking for beginning of value. For the record, the response headers include Content-Type:[application/json; charset=utf-8].

How can I decode content to avoid these invalid characters when unmarshaling?

Edit

This is the hexdump of content: play.golang.org/p/oJ5mqERAmj
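For readers who want to produce a similar dump locally, here is a minimal sketch using the standard encoding/hex package. The helper name firstBytesHex and the sample bytes are assumptions made for illustration; they are not taken from the API in question.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// firstBytesHex returns the first n bytes of b (or all of b if it is
// shorter) as a lowercase hex string. Hypothetical helper for illustration.
func firstBytesHex(b []byte, n int) string {
	if n > len(b) {
		n = len(b)
	}
	return hex.EncodeToString(b[:n])
}

func main() {
	// Stand-in for the API response: the first bytes of a gzip stream.
	content := []byte{0x1f, 0x8b, 0x08, 0x00}

	fmt.Println(firstBytesHex(content, 2)) // 1f8b

	// hex.Dump gives the classic offset/hex/ASCII layout for the whole slice.
	fmt.Print(hex.Dump(content))
}
```

Looking at the first two bytes is usually enough to tell whether the payload is text or some binary container format.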

Answer 1

Score: 13

Judging by your hex dump, you are receiving gzip-encoded data, so you'll need to use compress/gzip to decode it first.

Try something like this:

package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"

	"some/api"
)

func main() {
	content := api.SomeAPI.SomeRequest() // []byte variable

	// decompress the content into an io.Reader
	buf := bytes.NewBuffer(content)
	reader, err := gzip.NewReader(buf)
	if err != nil {
		panic(err)
	}
	defer reader.Close()

	// Use the stream interface to decode json from the io.Reader
	var data interface{}
	dec := json.NewDecoder(reader)
	err = dec.Decode(&data)
	if err != nil && err != io.EOF {
		panic(err)
	}
	fmt.Println("Data from response", data)
}
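If you can't be sure whether a given response is compressed, one option is to check for the gzip magic bytes (0x1f 0x8b, per RFC 1952) before decompressing. The following is only a sketch under that assumption: isGzip, decodeJSON and gzipBytes are hypothetical helper names, and gzipBytes just fabricates a compressed payload to stand in for the API response. It uses io.ReadAll, which needs Go 1.16 or later; on older versions ioutil.ReadAll does the same job.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
)

// isGzip reports whether b starts with the gzip magic bytes 0x1f 0x8b.
func isGzip(b []byte) bool {
	return len(b) >= 2 && b[0] == 0x1f && b[1] == 0x8b
}

// decodeJSON (hypothetical helper) gunzips content when it looks like
// gzip, then unmarshals the resulting JSON into v.
func decodeJSON(content []byte, v interface{}) error {
	if isGzip(content) {
		zr, err := gzip.NewReader(bytes.NewReader(content))
		if err != nil {
			return err
		}
		defer zr.Close()
		content, err = io.ReadAll(zr)
		if err != nil {
			return err
		}
	}
	return json.Unmarshal(content, v)
}

// gzipBytes compresses b. It only exists to simulate the API response.
func gzipBytes(b []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(b); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	compressed, err := gzipBytes([]byte(`{"name":"gopher","id":7}`))
	if err != nil {
		panic(err)
	}

	var data map[string]interface{}
	if err := decodeJSON(compressed, &data); err != nil {
		panic(err)
	}
	fmt.Println(data["name"], data["id"]) // gopher 7
}
```

Sniffing the payload keeps the caller working whether or not the server compresses a particular response. If you have access to the HTTP response itself, checking its Content-Encoding header is the more direct signal.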

Previous answer

Character \x1f is the unit separator character in ASCII and UTF-8. It never appears as part of a multi-byte UTF-8 sequence, but it can be used to mark off different bits of text. A string containing \x1f can be valid UTF-8 but, as far as I know, not valid JSON.
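This can be checked with a small sketch. The inspect helper and the sample payload are made up for illustration; the payload is simply a JSON object with a stray \x1f byte in front of it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"unicode/utf8"
)

// inspect reports whether raw is valid UTF-8 and what json.Unmarshal
// returns for it. Hypothetical helper for illustration.
func inspect(raw []byte) (bool, error) {
	var v interface{}
	return utf8.Valid(raw), json.Unmarshal(raw, &v)
}

func main() {
	// A JSON object preceded by the unit separator byte.
	raw := []byte("\x1f{\"a\":1}")

	ok, err := inspect(raw)
	fmt.Println(ok)  // true
	fmt.Println(err) // invalid character '\x1f' looking for beginning of value
}
```

So the byte passes a UTF-8 check while the JSON decoder rejects it, which matches the error in the question.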

I think you need to read the API specification closely to find out what they are using the \x1f markers for, but in the meantime you could try removing them and see what happens, e.g.:

package main

import (
	"bytes"
	"fmt"
)

func main() {
	b := []byte("hello\x1fGoodbye")
	fmt.Printf("b was %q\n", b)
	// Replace every unit separator byte with a space.
	b = bytes.Replace(b, []byte{0x1f}, []byte{' '}, -1)
	fmt.Printf("b is now %q\n", b)
}

Prints:

b was "hello\x1fGoodbye"
b is now "hello Goodbye"

Playground link
