S3 GetObject Returning Weird Encoding for HTML file

Question

I am attempting to read an HTML file into a Lambda Edge function and then return it in the response body, but for some reason, I cannot get it to return the contents of the HTML file correctly.

Here is my code (simplified):

const AWS = require('aws-sdk');

const S3 = new AWS.S3({
    signatureVersion: 'v4'
});

const { Body } = await S3.getObject({ Bucket: 'my-bucket', Key: 'index.html' }).promise();

console.log(Body.toString());

Instead of seeing <html... in the console log, I am seeing the dreaded question mark characters, which (I think) implies bad encoding:

> ��Y�r#��x�`�,b�J�ٲ��NR�yIٮ�!!���"���n���޴��Is�>}�n4pF�_���de�nq�~�]� f�����v��۔*�㺮Ý� Hdž�<�! �c�5�1B��,#|Ŵ;ֶ�U����z� �Qi��j�0��V ���H���...etc

I have literally tried everything including, but not limited to:

  • Body.toString('utf-8');
  • Body.toString('ascii');
  • Body.toString('base64');
  • decoder.write(Body.toString('base64'));
  • and a lot more...

I think I must be missing something really obvious here as I cannot find anyone else facing the same issue. I thought it might be to do with the encryption but my other Lambda Edge function reads an image file without issues so I assume it has to be something to do with encoding that I haven't thought of.

UPDATE

I believe the issue may be related to the fact that the file is gzipped.

Here is a print of the response from S3:

{
  AcceptRanges: 'bytes',
  LastModified: 2023-02-17T19:44:41.000Z,
  ContentLength: 1598,
  ETag: 'some-key',
  CacheControl: 'max-age=31536000',
  ContentEncoding: 'gzip',
  ContentType: 'text/html; charset=UTF-8',
  ServerSideEncryption: 'AES256',
  Metadata: { etag: 'some-key' },
  Body: <Buffer 1f 8b 08 00 00 00 00 00 00 03 cd 59 db 72 23 b7 11 fd 15 78 f2 60 bb 2c 62 ee b7 8d c8 4a b2 d9 b2 b7 ca 4e 52 bb 79 49 d9 ae 14 06 e8 21 21 cd 0c a6 ... 1548 more bytes>
}
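
Two things in this response point at gzip: the ContentEncoding field, and the first two bytes of Body (1f 8b), which are the gzip magic number. A minimal sketch of checking both is shown here; `looksGzipped` and `fakeResponse` are hypothetical names used only for illustration, and the fake response is built locally so the snippet runs without AWS access.

```javascript
const zlib = require('zlib');

// Hypothetical helper (not part of the AWS SDK): reports whether a
// getObject-style response looks gzip-compressed.
function looksGzipped({ Body, ContentEncoding }) {
    // gzip streams start with the two magic bytes 0x1f 0x8b (RFC 1952).
    const hasMagic = Buffer.isBuffer(Body) && Body.length >= 2 &&
        Body[0] === 0x1f && Body[1] === 0x8b;
    // The object may also declare its encoding in its metadata.
    const declared = typeof ContentEncoding === 'string' &&
        ContentEncoding.toLowerCase().includes('gzip');
    return hasMagic || declared;
}

// Stand-in for an S3 response (assumption: same field names as above).
const fakeResponse = {
    ContentEncoding: 'gzip',
    Body: zlib.gzipSync('<html><body>Hello</body></html>')
};

console.log(looksGzipped(fakeResponse)); // true
console.log(looksGzipped({ Body: Buffer.from('<html></html>') })); // false
```

If this returns true, calling toString() with any character encoding will only reinterpret compressed bytes; the data has to be decompressed first.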

Answer 1

Score: 0

The issue was caused by the fact that the index.html file was gzipped. The solution to handling this is as follows:

const AWS  = require('aws-sdk');
const zlib = require('zlib');

const S3 = new AWS.S3({
    signatureVersion: 'v4'
});

// Body is a Buffer holding the gzip-compressed bytes stored in S3.
const { Body } = await S3.getObject({ Bucket: 'my-bucket', Key: 'index.html' }).promise();

// unzipSync auto-detects gzip or deflate from the header and returns the decompressed Buffer.
const body = zlib.unzipSync(Body).toString();

console.log(body);

However, given that this code is being deployed within a Lambda Edge function, speed is essential. It may be faster to fetch the file with https.get instead of going through the SDK, but note that Node's built-in https.get does not decompress a gzip-encoded response on its own: it returns the raw bytes, so the zlib step is still needed on that path.
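
One way to keep the decompression step cheap is to run it only when the object is actually gzip-encoded, and to use the asynchronous zlib API so the event loop is not blocked. The following is a minimal sketch under those assumptions; `readBodyAsText` is a hypothetical helper, and the `gzipped` and `plain` objects are local stand-ins for what `S3.getObject(...).promise()` would resolve to.

```javascript
const zlib = require('zlib');
const { promisify } = require('util');

// Promise-based wrapper around the callback-style zlib.gunzip.
const gunzip = promisify(zlib.gunzip);

// Hypothetical helper: converts a getObject-style response into a string,
// decompressing only when ContentEncoding says the body is gzip.
async function readBodyAsText({ Body, ContentEncoding }) {
    const isGzip = (ContentEncoding || '').toLowerCase() === 'gzip';
    const raw = isGzip ? await gunzip(Body) : Body;
    return raw.toString('utf-8');
}

// Local stand-ins for S3 responses, so the sketch runs without AWS access.
const html = '<html><body>Hello</body></html>';
const gzipped = { ContentEncoding: 'gzip', Body: zlib.gzipSync(html) };
const plain = { Body: Buffer.from(html) };

readBodyAsText(gzipped).then((text) => console.log(text === html)); // true
readBodyAsText(plain).then((text) => console.log(text === html));   // true
```

In the real function, the object passed to `readBodyAsText` would be the resolved value of the getObject call shown in the answer.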

huangapple
  • This article was published on February 19, 2023 at 19:54:10
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75499951.html