Unknown characters showing up in decompressed string via gzip

Question

I have a mobile app that compresses a JSON string with gzip and stores the result in a database. I also have a web app that decompresses the same JSON string to display the data on a web page. The problem is that the string decompressed in the web app starts with some unknown characters that do not appear when it is decompressed in the mobile app.

The Android app is written in Java and uses the following code to compress the string:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
GZIPOutputStream gzipOut = null;
try {
    gzipOut = new GZIPOutputStream(baos);
    ObjectOutputStream objectOut = new ObjectOutputStream(gzipOut);
    objectOut.writeObject(jsonData);
    objectOut.close();

    // Wrapper for the byte array
    ServerData nData = new ServerData();
    nData.data = baos.toByteArray();
    String finalData = JSONObjectStringConverter.json.toJson(nData);
    return finalData;
} catch (IOException e) {
    e.printStackTrace();
}

The above seems to work correctly, storing the byte array within the wrapper and then into the database.

The web app uses a Node.js backend and the following code to decompress the data after it has been retrieved from the database and removed from the wrapper:

const zlib = require('zlib');

try {
    // Convert the byte array back to JSON
    const decompressedData = zlib.gunzipSync(new Uint8Array(compressedByteArray));
    jsonData = decompressedData.toString();
} catch (e) {
    console.error(e);
}

Again, this mostly seems to work but there are some unidentified characters at the beginning of the string:

"��\u0000\u0005t��{\"cloudData\": ..."

I thought it may have something to do with the header of the compressed string (i.e. the first 10 bytes in the byte array):

[31, -117, 8, 0, 0, 0, 0, 0, 0, 0,...

but I wasn't able to make much progress on that end. Does anyone have any other suggestions as to what the problem could be?

Answer 1

Score: 1

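ObjectOutputStream is part of Java's general object serialization mechanism. It wraps your string in its own binary framing, which is what shows up as the unknown characters, and you're going to have a hard time making that format work with Node.js.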
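
To see where the leading characters in the decompressed string come from, it may help to look at what ObjectOutputStream writes for a plain string. The following is a minimal sketch, not the asker's code: the class name SerializationHeaderDemo and its helper methods are made up for illustration, and the gzip layer is left out because it only wraps these bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the original question or answer.
public class SerializationHeaderDemo {

    // Serializes a String through ObjectOutputStream, as the question's code
    // does before the bytes reach the gzip stream.
    static byte[] serialize(String text) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(baos)) {
            out.writeObject(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return baos.toByteArray();
    }

    // Formats a byte array as space-separated uppercase hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints "AC ED 00 05 74 00 02 7B 7D"
        System.out.println(toHex(serialize("{}")));
    }
}
```

The first four bytes (AC ED 00 05) are the serialization stream's magic number and version. They are followed by the string type marker 0x74, which is the letter t, and a two-byte length. Only then does the JSON text itself (7B 7D, i.e. {}) begin. This lines up with the \u0000\u0005t prefix shown in the question, with the remaining non-text bytes rendered as replacement characters.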

Get rid of the object stream and write your JSON directly to the gzip stream. Assuming jsonData is a String, use:

gzipOut.write(jsonData.getBytes("UTF8"));
gzipOut.close();
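
For context, here is a fuller sketch of how the two lines above might fit into a complete method, together with a matching decompression method for a round-trip check. The class name GzipJson and the method names compress and decompress are assumptions for illustration, and StandardCharsets.UTF_8 is used in place of the "UTF8" charset name so that no checked UnsupportedEncodingException is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the original question or answer.
public class GzipJson {

    // Compresses the JSON text as raw UTF-8 bytes, with no ObjectOutputStream.
    static byte[] compress(String json) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzipOut = new GZIPOutputStream(baos)) {
            gzipOut.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed at this point, so the trailer has been written.
        return baos.toByteArray();
    }

    // Reverses compress(): gunzips the bytes and decodes them as UTF-8.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzipIn =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzipIn.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String json = "{\"cloudData\": []}";
        byte[] compressed = compress(json);
        System.out.println(decompress(compressed).equals(json)); // prints "true"
    }
}
```

With this change the decompressed bytes are the UTF-8 JSON text and nothing else, so the existing zlib.gunzipSync(...).toString() call on the Node.js side should return the JSON without a prefix. Two caveats: any code in the mobile app that reads the data back through an ObjectInputStream would need the same simplification, and rows already stored in the old format will still carry the serialization header.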

huangapple
  • Posted on August 3, 2020, 10:29:56
  • When reposting, please retain the link to this article: https://go.coder-hub.com/63222970.html